(57) Abstract: An audio/video reproducing apparatus is connectable to a communications network for selectively
reproducing items of audio/video material from a recording medium in response to a request received via the
telecommunications network. The audio/video reproducing apparatus may comprise a control processor operable in use to
receive data representing the request for the audio/video material item via the communications network. A reproducing
processor is operable in response to signals identifying the audio/video material items from the control processor to
reproduce the audio/video material items. The data identifying the audio/video material items includes meta data
indicative of the audio/video material items. The meta data may be one of UMID, tape ID and time codes, and a Unique
Material Identifier identifying the material items. To facilitate the identification and selection of the audio and/or
video material, an audio and/or video processing apparatus is provided for processing audio and/or video signals
representing sound and/or images. The processing apparatus comprises an activity detector operable to generate an activity signal
indicative of an amount of activity within the sound/images represented by the audio/video signal, and a meta data generator coupled
to the activity detector which is operable to generate sample images at temporal positions within the audio/video signal, which
temporal positions are determined from the activity signal. The processing apparatus thereby provides a facility for automatically
generating meta data from received audio/video signals. The meta data can be used to select the audio/video material.
AUDIO/VIDEO REPRODUCING APPARATUS AND METHOD

Field of the Invention 

The present invention relates to audio/video reproducing apparatus and
methods of reproducing audio/video material.

The present invention also relates to video processing apparatus, audio
processing apparatus and methods of processing video signals and audio signals.

The present invention also relates to editing systems for combining items of
audio/video material to form audio/video productions. The present invention also
relates to methods of generating audio/video productions.

Background of the Invention

Editing is a process in which items of audio/video material are combined to
form an audio/video production. Generally audio/video material items are captured
from a source in accordance with a pre-determined plan. However, typically many
audio/video material items are not used in the edited version of the audio/video
production. For example, a television program, such as a high quality drama, may be
formed from a combination of takes of audio/video material items from a single
camera. As such, in order to form the program, several takes are combined in order to
form a flow required by the story of the drama. Furthermore several takes may be
generated for each scene but only a selected number of these takes are combined in
order to form the scene.

The term audio and/or video will be referred to herein as audio/video and 
includes any form of information representing sound or visual images or a 
combination of sound and visual images. 

In a post production process the items of audio/video material are selectively
combined by the editor to form the audio/video production. However in order to select
the required audio/video material items to form the production, the editor must review
the items of audio/video material that have been generated. This is a time consuming
and arduous task, particularly when a linear recording medium, such as a video tape,
has been used to record the audio/video material items.

In general the quality of the images represented on the recording medium, to
the extent that the images and/or sound represent the original source, is arranged to be
as high as possible. This means that an amount of information that must be stored to
represent these images and/or sound is relatively high. As a result, the images and/or
sound cannot be readily accessed so that the content of the audio/video material items
cannot be easily ascertained once recorded. This is particularly so if a format in
which the images and sounds are represented is compressed in some way. For
example video cameras and camcorders are arranged conventionally to record video
signals representing the moving images on a video tape. Once the video signals have
been recorded on to the video tape, a user cannot determine the content of the video
tape without reviewing the entire tape. Furthermore, because video tape is an example
of a linear recording medium, the task of navigating through the media to locate
particular content items of video material is time consuming and labour intensive. As
a result during an editing process in which selected items from the contents of the
video tape are combined in an order which may be different to that in which they were
recorded, it may be necessary to review the entire contents of the video tape in order to
identify the selected items.

Summary of Invention 

According to the present invention there is provided an audio/video
reproducing apparatus connectable to a communications network for selectively
reproducing items of audio/video material from a recording medium in response to a
request received via said communications network.

By providing an audio/video reproducing apparatus which is connectable to a
communications network, an editing facility is provided for reproducing audio/video
material items, in which the items may be remotely selected. A network connection
provides a facility for the audio/video material items to be accessed separately by more
than one editing terminal.

The content of video material generated by a camera is typically stored in a
form which facilitates a high quality reproduction. In general the quality of the images
represented by the video signal, to the extent that the images reflect an original image
source falling within the field of view of the camera, is arranged to be as high as
possible. This means that an amount of information that must be stored to represent
these images is relatively high. This in turn requires that the video signal is stored in a
format that does not readily allow access to the content of the video signals. This is
particularly so if the video signal is compressed in some way. For example, video
cameras and camcorders are arranged conventionally to record video signals
representing moving images on to a video tape. Once the video signals have been
recorded on to the video tape, a user cannot readily determine the content of the video
tape without reviewing the entire tape. Alternatively, the contents of the recording
medium may be ingested to provide substantially non-linear access to the audio/video
material. However this is time consuming, particularly for example for a linear
recording medium. Therefore by providing a facility for accessing the audio/video
material items via a network, the items may be selectively accessed via the network,
without being ingested and without having to review the entire tape.

In preferred embodiments the audio/video reproducing apparatus may comprise
a control processor which is arranged in use to receive data representing requests for
audio/video material items via the communications network, and a reproducing
processor coupled to the control processor and arranged in response to signals
identifying the audio/video material items from the control processor to reproduce the
audio/video material items, which are communicated via the communications network.

The task of navigating through the media to locate particular content items of
video material is time consuming and labour intensive. As a result during an editing
process in which selected items from the contents of the video tape are combined in an
order which may be different to that in which they were recorded, it may be necessary
to review the entire contents of the video tape in order to identify the selected items.
Hence by identifying the audio/video material items required and reproducing only the
items identified, an advantage is provided in respect of the time taken to edit an
audio/video production.

In order to receive commands identifying the audio/video material items and to
communicate the audio/video material items, the audio/video reproducing apparatus
may comprise a first network interface connectable to a first communications network
for receiving the data representing the requests for audio/video material, and a second
network interface connectable to a second communications network for
communicating the items of audio/video material. By providing a first network
interface adapted to receive data representative of requests for audio/video data and a
second interface for communicating the items of audio/video material, the first and
second interfaces can be optimised for the different type of data being communicated.
For the audio/video material items this is particularly important because the network
connection must stream audio/video which requires a relatively high bandwidth. As
such in preferred embodiments, the first network interface may be arranged to operate
in accordance with a data communications network standard such as Ethernet, RS 232
or RS 422 or the like. Furthermore, the second network interface may be arranged to
operate in accordance with the Serial Digital Interface (SDI) or the Serial Digital
Transport Interface (SDTI).

A particular advantage is provided by identifying the content of the audio/video
material items so that appropriate items may be selected and ingested via the network.
Meta data is data which serves to describe either the content of audio/video material or
parameters present or used to generate the audio/video material or any other
information associated with the audio/video material.

In preferred embodiments, the data representing requests for audio/video
material items includes meta data indicative of the audio/video material items. The
meta data may be at least one of UMID, tape ID and time codes, and a Unique
Material Reference Number.
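
By way of illustration only, a request of this kind could be modelled as a small record carrying whichever identifiers are available. The class name `MaterialRequest`, its field names and the `HH:MM:SS:FF` time code layout are assumptions made for this sketch; the specification does not define a data format for the request.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class MaterialRequest:
    """Hypothetical request for one audio/video material item."""
    umid: Optional[str] = None          # UMID, if known
    tape_id: Optional[str] = None       # tape ID, used together with time codes
    in_timecode: Optional[str] = None   # assumed layout "HH:MM:SS:FF"
    out_timecode: Optional[str] = None
    material_ref: Optional[str] = None  # Unique Material Reference Number

    def is_resolvable(self) -> bool:
        """True if at least one identifier form is complete enough to locate the item."""
        has_tape_range = None not in (self.tape_id, self.in_timecode, self.out_timecode)
        return self.umid is not None or self.material_ref is not None or has_tape_range
```

In this sketch a tape ID alone is not treated as sufficient, since the text pairs the tape ID with time codes; a UMID or a reference number is taken to identify the item on its own.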

Although the reproducing apparatus may be arranged to reproduce items of
audio/video material from a single recording medium, the reproducing processor may
comprise a plurality of audio/video recording/reproducing apparatus each of which is
coupled to said control processor via a local data bus. A further improvement is
provided to the audio/video reproducing apparatus in accessing a plurality of recording
media from the control processor so that, for example, the entire contents of a shoot from
which the audio/video production is to be generated can be accessed via the network.
Access may also be arranged in parallel. The recording media may also be different,
so that some of the plurality of audio/video recording/reproducing apparatus may
reproduce the audio/video items from tape and some from disc.

In order to access the audio/video material present on the recording media, in
preferred embodiments, the local bus may include a control communications channel
for communicating control data to and/or from the control processor, and a video data
communications channel for communicating the items of audio/video material from
the plurality of audio/video recording/reproducing apparatus to the communications
network.

To provide an indication of the contents of the audio/video material, the
audio/video reproducing apparatus may have a display device which is arranged in
operation to display images representative of the audio/video material items present on
the recording medium. Furthermore to facilitate access to the audio/video material
items, the display device may be a touch screen coupled to the control processor, and
arranged in use to receive touch commands from a user for selecting the items of
audio/video material.

According to another aspect of the present invention there is provided a video
processing apparatus for processing video signals representing images comprising an
activity detector which is arranged in operation to receive the video signals and to
generate an activity signal indicative of an amount of activity within the images
represented by the video signal, and a meta data generator coupled to the activity
detector which is arranged in operation to receive the video signal and the activity
signal and to generate meta data representing the content of the video signals at
temporal positions within the video signal, which temporal positions are determined
from the activity signal.

In preferred embodiments the meta data generator is an image generator, the
meta data generated being sample images at the temporal positions within the video
signal determined by the activity signal.

The present invention provides a particular advantage in providing an
indication of the content of video signals, at temporal positions within those signals at
which there is activity. As a result an improvement is provided to an editing process, or
to a process in which the video signals are being ingested for further processing, by
providing a visual indication from the sample images of the content of the video
signals at temporal positions within the video signals which may be of most interest to
an editor or user.

The sample images can provide a static representation of the moving video
images which facilitates navigation by providing a reference to the content of the
moving video images.

The activity signal may be generated by forming a colour histogram of the
colour components within an image and determining activity from a rate of change of
the histogram, or from, for example, motion vectors for selected image components.
The activity signal may therefore be representative of a relative amount of activity
within the images represented by the video signal and the image generator may be
arranged in operation to produce more of the sample images during periods of greater
activity indicated by the activity signal. By arranging for more sample images to be
generated during periods of greater activity, the information provided to an editor about the
content of the video signals is increased, or alternatively the available resources for
generating the sample images are concentrated on periods within the video signal of
most interest.
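
The histogram-based variant described above may be sketched as follows. This is a minimal illustration under stated assumptions, not the implementation of the embodiments: each frame is taken to be a small grid of values from 0 to 255 for a single colour component, the rate of change is approximated by the sum of absolute differences between consecutive histograms, and the function names and the `budget` parameter are hypothetical.

```python
from typing import List, Sequence

Frame = Sequence[Sequence[int]]  # rows of values (0..255) for one colour component


def histogram(frame: Frame, bins: int = 8) -> List[float]:
    """Normalised histogram of one colour component of a non-empty frame."""
    counts = [0] * bins
    total = 0
    for row in frame:
        for value in row:
            counts[min(value * bins // 256, bins - 1)] += 1
            total += 1
    return [c / total for c in counts]


def activity_signal(frames: Sequence[Frame], bins: int = 8) -> List[float]:
    """Activity per frame: change between consecutive histograms (first frame is 0)."""
    hists = [histogram(f, bins) for f in frames]
    signal = [0.0]
    for prev, cur in zip(hists, hists[1:]):
        signal.append(sum(abs(a - b) for a, b in zip(prev, cur)))
    return signal


def sample_positions(signal: Sequence[float], budget: float = 1.0) -> List[int]:
    """Emit a sample position whenever accumulated activity reaches `budget`,
    so that periods of greater activity yield more sample images."""
    positions: List[int] = []
    accumulated = 0.0
    for index, value in enumerate(signal):
        accumulated += value
        if accumulated >= budget:
            positions.append(index)
            accumulated = 0.0
    return positions
```

Accumulating activity against a fixed budget is one simple way of concentrating sample images where the histogram changes most; a motion-vector based detector could feed the same `sample_positions` function.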

In order to reduce an amount of data capacity required to store and/or
communicate the sample images, the sample images may be represented by a
substantially reduced amount of data in comparison to the images represented by the
video signal.

Although the video processing apparatus may receive the video signals from a
separate source, advantageously the video processing apparatus may further comprise
a reproduction processor which is arranged in operation to receive a recording medium
on which the video signals are recorded and to reproduce the video signals from the
recording medium. Furthermore in preferred embodiments the image generator may
be arranged in operation to generate, for each of the sample images, a material
identification representative of locations on the recording medium where the video
signals corresponding to the sample images are recorded. This provides an advantage
in not only providing a visual indication of the contents of a recording medium, but
also providing with the visual indication a location at which this content is stored so
that the video signals at this location can be reproduced for further editing.

According to another aspect of the present invention there is provided an audio
processing apparatus for processing an audio signal representing sound, the apparatus
comprising an activity detector which is arranged in operation to receive the audio
signal and to generate an activity signal indicative of an amount of activity within the
sound represented by the audio signal, and a meta data generator coupled to the
activity detector which is arranged in operation to receive the audio signal and the
activity signal and to generate meta data representing the content of the audio signals
at temporal positions within the audio signal, which temporal positions are determined
from the activity signal.

According to a further aspect of the present invention there is provided an
audio processing apparatus for processing audio signals representative of sound, the
audio processing apparatus comprising a speech analysis processor which is arranged
in operation to generate speech data identifying speech detected within the audio
signals, an activity processor coupled to the speech analysis processor and arranged in
operation to generate an activity signal in response to the speech data, and a content
information generator, coupled to the activity processor and the speech analysis
processor and arranged in operation to generate data representing the content of the
speech at temporal positions within the audio signal determined by the activity signal.

As for video signals, the present invention finds application in generating an
indication of the content of speech present in audio signals, whereby navigation
through the content of the audio signals is facilitated. For example, in preferred
embodiments, the activity signal may be indicative of the start of a speech sentence, so
that the data representing the content of the speech provides an indication of the
content of the start of each sentence.
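
As a rough illustration of this idea, the following sketch derives content data for the start of each sentence from already-recognised speech. It assumes, purely for the example, that the speech analysis processor yields timed words and that a sentence ends at a word carrying terminal punctuation; the function name and the `n_words` parameter are hypothetical.

```python
from typing import List, Tuple

Word = Tuple[float, str]  # (start time in seconds, recognised word)


def sentence_starts(words: List[Word], n_words: int = 3) -> List[Tuple[float, str]]:
    """Return (time, text) for the first `n_words` of each sentence.

    A sentence is assumed to end at a word ending in '.', '?' or '!'.
    """
    results: List[Tuple[float, str]] = []
    current: List[Word] = []
    for word in words:
        current.append(word)
        if word[1].endswith((".", "?", "!")):
            results.append((current[0][0], " ".join(w for _, w in current[:n_words])))
            current = []
    if current:  # trailing sentence without terminal punctuation
        results.append((current[0][0], " ".join(w for _, w in current[:n_words])))
    return results
```

Each returned time plays the role of a temporal position indicated by the activity signal, and the accompanying text is the content data for that position.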

The content data can provide a static structural indication of the content of the 
audio signals which can facilitate navigation through the audio signals by providing a 
reference to the content of those signals. 

Although the audio processor may receive the audio signal from a separate
source, in preferred embodiments, the reproduction processor may be arranged in
operation to receive a recording medium on which the audio signals are recorded and
to reproduce the audio signals from the recording medium. Furthermore, the content
information generator may be arranged in operation to generate, for each of the content
data items, a material identification representative of a location on the recording
medium where the audio signals corresponding to the content data are recorded. As
such, an advantage is provided to an editor by associating a material identifier
providing the location of the audio signals on the recording medium corresponding to
the content data, with the content data which can be used to navigate through the
recording medium. The content data may be any convenient representation of the
content of the speech, however, in preferred embodiments the content data is
representative of text corresponding to the content of the speech.

According to another aspect of the present invention there is provided a system
for editing audio/video productions comprising an ingestion processor having means
for receiving a recording medium and arranged in use to reproduce audio/video
material items from the recording medium, a data base operable to receive and to store
meta data describing the contents of audio/video material items loaded into the
ingestion processor, and an editing processor coupled to the ingestion processor and
the data base, the editing processor having a graphical user interface for displaying a
representation of the meta data stored in the data base and for selecting the audio/video
material items from the displayed representation of the meta data, the editing processor
being arranged to combine user selected items of audio/video material, which are
selectively reproduced by the ingestion processor in response to meta data
corresponding to the selected items of audio/video material being communicated to the
ingestion processor by the editing processor.

As already explained, during acquisition, once the signals representing the
audio/video material items have been recorded on to the recording medium, a user
cannot readily determine the content of the audio/video material items without
reproducing the items from the recording medium. Alternatively, the contents of the
recording medium may be ingested to provide substantially non-linear access to the
audio/video material. This is time consuming, particularly for example for a linear
recording medium. However by providing access to meta data which may be
generated at acquisition of the audio/video material, and which describes the content of
the material, an editing system may select and only reproduce items of audio/video
material from the recording medium which are required for the edited audio/video
production. As such the editing process is made more efficient by only ingesting
audio/video material items which are required for the audio/video production.
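
The selective ingestion described above can be illustrated with a small sketch. The class `IngestionProcessor`, the helper functions and the use of a free-text description as the stored meta data are all assumptions made for the example; they only serve to show that items not selected from the meta data are never reproduced from the recording medium.

```python
from typing import Dict, List


class IngestionProcessor:
    """Hypothetical stand-in for the ingestion processor: it reproduces
    only the items it is asked for and records which ones were read."""

    def __init__(self, medium: Dict[str, bytes]) -> None:
        self._medium = medium            # material id -> recorded audio/video
        self.reproduced: List[str] = []  # ids actually read from the medium

    def reproduce(self, material_id: str) -> bytes:
        self.reproduced.append(material_id)
        return self._medium[material_id]


def select_by_metadata(database: Dict[str, str], keyword: str) -> List[str]:
    """Pick material ids whose stored description mentions `keyword`."""
    return sorted(mid for mid, text in database.items() if keyword.lower() in text.lower())


def build_production(database: Dict[str, str], ingester: IngestionProcessor,
                     keyword: str) -> List[bytes]:
    """Ingest only the selected items, in id order."""
    return [ingester.reproduce(mid) for mid in select_by_metadata(database, keyword)]
```

In a real system the selection would come from a user working with the graphical user interface rather than from a keyword match; the keyword is used here only to keep the example self-contained.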

Advantageously, the editing processor may be coupled to the data base and to
the ingestion processor via a data communications network. The communications
network provides a facility for accessing the meta data and the audio/video material
items remotely. Additionally, more than one editing processor may be coupled to the
communications network thereby providing a facility for the meta data in the data base
and the audio/video material to be selectively accessed, whereby more than
one audio/video production may be edited contemporaneously.

In preferred embodiments, the data communications network may comprise a
first communications network coupled to the editing station, the data base and the
ingestion processor for communicating the meta data, and a second communications
network coupled to the editing station, the data base and the ingestion processor for
communicating the items of audio/video material. By providing a first
communications channel adapted to receive data representative of requests for
audio/video data and a second communications channel for communicating the items
of audio/video material, the first and second interfaces can be optimised for the
different type of data being communicated. For the audio/video material items this is
advantageous because the network connection must stream audio/video which requires
a relatively high bandwidth. As such in preferred embodiments, the first network
interface may be arranged to operate in accordance with a data communications
network standard such as Ethernet, RS 232 or RS 422 or the like. Furthermore, the
second network interface may be arranged to operate in accordance with the Serial
Digital Interface (SDI) or the Serial Digital Transport Interface (SDTI).

In preferred embodiments, the meta data may be one of a UMID, tape ID and 
time codes, and a Unique Material Reference Number, identifying the material items. 

As mentioned above, the meta data may be generated with the audio/video
material items during acquisition. As such, the recording medium may include the
meta data describing the content of the audio/video material items recorded on to the
recording medium, and the ingestion processor may be arranged in operation to
reproduce the meta data and to communicate the meta data via the network to the data
base, the data base operating to receive and to store the meta data.

A particular advantage is provided by identifying the content of the audio/video
material items so that appropriate items may be selected and ingested via the network.

The term meta data as used herein refers to and includes any form of
information or data which serves to describe either the content of audio/video material
or parameters present or used to generate the audio/video material or any other
information associated with the audio/video material. Meta data may be, for example,
"semantic meta data" which provides contextual/descriptive information about the
actual content of the audio/video material. Examples of semantic meta data are the
start of periods of dialogue, changes in a scene, introduction of new faces or face
positions within a scene or any other items associated with the source content of the
audio/video material. The meta data may also be syntactic meta data which is
associated with items of equipment or parameters which were used whilst generating
the audio/video material such as, for example, an amount of zoom applied to a camera
lens, an aperture and shutter speed setting of the lens, and a time and date when the
audio/video material was generated. Although meta data may be recorded with the
audio/video material with which it is associated, either on separate parts of a recording
medium or on common parts of a recording medium, meta data in the sense used
herein is intended for use in navigating and identifying features and essence of the
content of the audio/video material, and may therefore be separated from the
audio/video signals when the audio/video signals are reproduced. The meta data is
therefore separable from the audio/video signals.
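
The distinction between semantic and syntactic meta data, and the separability of both from the audio/video signals, might be pictured with the following sketch. The class names, field names and the representation of semantic meta data as timed descriptions are assumptions made for illustration only.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass
class MetaData:
    # Semantic meta data: describes the content itself (assumed layout:
    # a list of (time in seconds, description) events).
    semantic: List[Tuple[float, str]] = field(default_factory=list)
    # Syntactic meta data: equipment settings in use during capture.
    syntactic: Dict[str, str] = field(default_factory=dict)


@dataclass
class MaterialItem:
    signals: bytes  # the audio/video signals themselves
    meta: MetaData


def separate(item: MaterialItem) -> Tuple[bytes, MetaData]:
    """Split an item into its audio/video signals and its meta data."""
    return item.signals, item.meta
```

Because the meta data is held apart from the signals, it can be sent to a data base or an editing terminal without transferring the far larger audio/video signals.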

Various further aspects and features of the present invention are defined in the 
appended claims. 

Brief Description of Drawings

Embodiments of the present invention will now be described by way of
example with reference to the accompanying drawings wherein:

Figure 1 is a schematic block diagram of a video camera arranged in operative
association with a Personal Digital Assistant (PDA),

Figure 2 is a schematic block diagram of parts of the video camera shown in
figure 1,

Figure 3 is a pictorial representation providing an example of the form of the
PDA shown in figure 1,

Figure 4 is a schematic block diagram of a further example arrangement of
parts of a video camera and some of the parts of the video camera associated with
generating and processing meta data as a separate acquisition unit associated with a
further example PDA,

Figure 5 is a pictorial representation providing an example of the form of the
acquisition unit shown in figure 4,

Figure 6 is a part schematic part pictorial representation illustrating an example
of the connection between the acquisition unit and the video camera of figure 4,

Figure 7 is a part schematic block diagram of an ingestion processor coupled to
a network, part flow diagram illustrating the ingestion of meta data and audio/video
material items,

Figure 8 is a pictorial representation of the ingestion processor shown in figure 7,

Figure 9 is a part schematic block diagram part pictorial representation of the
ingestion processor shown in figures 7 and 8 shown in more detail,

Figure 10 is a schematic block diagram showing the ingestion processor shown
in operative association with the database of figure 7,

Figure 11 is a schematic block diagram showing a further example of the
operation of the ingestion processor shown in figure 7,

Figure 12a is a schematic representation of the generation of picture stamps at
sample times of audio/video material,

Figure 12b is a schematic representation of the generation of text samples with
respect to time of the audio/video material,

Figure 13 provides an illustrative representation of an example structure for
organising meta data,

Figure 14 is a schematic block diagram illustrating the structure of a data
reduced UMID, and

Figure 15 is a schematic block diagram illustrating the structure of an extended
UMID.

Description of Preferred Embodiments
Acquisition Unit 

Embodiments of the present invention relate to audio and/or video generation
apparatus which may be for example television cameras, video cameras or camcorders.
An embodiment of the present invention will now be described with reference to figure
1 which provides a schematic block diagram of a video camera which is arranged to
communicate to a personal digital assistant (PDA). A PDA is an example of a data
processor which may be arranged in operation to generate meta data in accordance
with a user's requirements. The term personal digital assistant is known to those
acquainted with the technical field of consumer electronics as a portable or hand held
personal organiser or data processor which includes an alpha numeric key pad and a
hand writing interface.

In figure 1 a video camera 101 is shown to comprise a camera body 102 which
is arranged to receive light from an image source falling within a field of view of an
imaging arrangement 104 which may include one or more imaging lenses (not shown).
The camera also includes a view finder 106 and an operating control unit 108 from
which a user can control the recording of signals representative of the images formed
within the field of view of the camera. The camera 101 also includes a microphone
110 which may be a plurality of microphones arranged to record sound in stereo. Also
shown in figure 1 is a hand-held PDA 112 which has a screen 114 and an
alphanumeric key pad 116 which also includes a portion to allow the user to write
characters recognised by the PDA. The PDA 112 is arranged to be connected to the
video camera 101 via an interface 118. The interface 118 is arranged in accordance
with a predetermined standard format such as, for example, RS232 or the like. The
interface 118 may also be effected using infra-red signals, whereby the interface 118 is
a wireless communications link. The interface 118 provides a facility for
communicating information with the video camera 101. The function and purpose of
the PDA 112 will be explained in more detail shortly. However in general the PDA
112 provides a facility for sending and receiving meta data generated using the PDA
112 and which can be recorded with the audio and video signals detected and captured
by the video camera 101. A better understanding of the operation of the video camera
101 in combination with the PDA 112 may be gathered from figure 2 which shows a
more detailed representation of the body 102 of the video camera which is shown in
figure 1 and in which common parts have the same numerical designations.

In figure 2 the camera body 102 is shown to comprise a tape drive 122 having
read/write heads 124 operatively associated with a magnetic recording tape 126. Also
shown in figure 2 the camera body includes a meta data generation processor 128
coupled to the tape drive 122 via a connecting channel 130. Also connected to the
meta data generation processor 128 is a data store 132, a clock 136 and three sensors
138, 140, 142. The interface unit 118, also shown in figure 2, sends and receives data
via a wireless channel 119. Correspondingly, two connecting channels 148 and 150,
for receiving and transmitting data respectively, connect the interface unit 118 to the
meta data generation processor 128. The
meta data generation processor is also shown to receive via a connecting channel 151
the audio/video signals generated by the camera. The audio/video signals are also fed
to the tape drive 122 to be recorded on to the tape 126.

The video camera 101 shown in figure 1 operates to record visual information
falling within the field of view of the lens arrangement 104 onto a recording medium.
The visual information is converted by the camera into video signals. In combination,
the visual images are recorded as video signals with accompanying sound which is
detected by the microphone 101 and arranged to be recorded as audio signals on the
recording medium with the video signals. As shown in figure 2, the recording medium
is a magnetic tape 126, onto which the audio and video signals are recorded by the
read/write heads 124. The arrangement by which the video signals and the audio
signals are recorded by the read/write heads 124 onto the magnetic tape 126 is not
shown in figure 2 and will not be further described as this does not provide any
greater illustration of the example embodiment of the present invention. However,
once a user has captured visual images and recorded these images, together with the
accompanying audio signals, using the magnetic tape 126, meta data describing the
content of the audio/video signals may be input using the PDA 112. As will be
explained shortly this meta data can be information that identifies the audio/video
signals in association with a pre-planned event, such as a 'take'. As shown in figure 2
the interface unit 118 provides a facility whereby the meta data added by the user
using the PDA 112 may be received within the camera body 102. Data signals may be
received via the wireless channel 119 at the interface unit 118. The interface unit 118
serves to convert these signals into a form in which they can be processed by the meta
data generation processor 128 which receives these data signals via the connecting
channels 148, 150.

Meta data is generated automatically by the meta data generation processor 128
in association with the audio/video signals which are received via the connecting
channel 151. In the example embodiment illustrated in figure 2, the meta data
generation processor 128 operates to generate time codes with reference to the clock
136, and to write these time codes on to the tape 126 in a linear recording track
provided for this purpose. Furthermore, the meta data generation processor 128 forms
other meta data automatically such as a UMID, which identifies uniquely the
audio/video signals. The meta data generation processor may operate in combination
with the tape drive 122, to write the UMID on to the tape with the audio/video
signals.

In an alternative embodiment, the UMID, as well as other meta data, may be
stored in the data store 132 and communicated separately from the tape 126. In this
case, a tape ID is generated by the meta data generation processor 128 and written on
to the tape 126, to identify the tape 126 from other tapes.

In order to generate the UMID and other meta data identifying the contents of
the audio/video signals, the meta data generation processor 128 is arranged in
operation to receive signals from the sensors 138, 140, 142, as well as the clock 136.
The meta data generation processor therefore operates to co-ordinate these signals,
which provide it with meta data such as the aperture setting of the camera lens 104,
the shutter speed and a signal received via the control unit 108 to indicate that the
visual images captured are a "good shot". These signals and data are generated by the
sensors 138, 140, 142 and received at the meta data generation processor 128. The
meta data generation processor in the example embodiment is arranged to produce
syntactic meta data which provides operating parameters which are used by the camera
in generating the video signals. Furthermore the meta data generation processor 128
monitors the status of the camcorder 101, and in particular whether audio/video
signals are being recorded by the tape drive 122. When RECORD START is detected
the IN POINT time code is captured and a UMID is generated in correspondence with
the IN POINT time code. Furthermore in some embodiments an extended UMID is
generated, in which case the meta data generation processor is arranged to receive
spatial co-ordinates which are representative of the location at which the audio/video
signals are acquired. The spatial co-ordinates may be generated by a receiver which
operates in accordance with the Global Positioning System (GPS). The receiver may be
external to the camera, or may be embodied within the camera body 102.

When RECORD STOP is detected, the OUT POINT time code is captured by
the meta data generation processor 128. As explained above, it is possible to generate
a "good shot" marker. The "good shot" marker is generated during the recording
process, and detected by the meta data generation processor. The "good shot" marker
is then either stored on the tape, or within the data store 132, with the corresponding
IN POINT and OUT POINT time codes.
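
The capture of IN POINT and OUT POINT time codes described above can be illustrated with a short sketch. It is only a minimal model, assuming that the recording status arrives as discrete start and stop events; the names `Take`, `TakeLogger`, `record_start`, `record_stop` and `mark_good_shot` are hypothetical and do not appear in the specification.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Take:
    in_point: str                    # time code captured when recording starts
    out_point: Optional[str] = None  # time code captured when recording stops
    good_shot: bool = False          # set if a "good shot" marker is received


@dataclass
class TakeLogger:
    """Hypothetical logger standing in for the meta data generation processor."""
    takes: List[Take] = field(default_factory=list)
    current: Optional[Take] = None

    def record_start(self, time_code: str) -> None:
        # Recording has started: capture the IN POINT.
        self.current = Take(in_point=time_code)

    def mark_good_shot(self) -> None:
        # The marker is only meaningful while a take is being recorded.
        if self.current is not None:
            self.current.good_shot = True

    def record_stop(self, time_code: str) -> None:
        # Recording has stopped: capture the OUT POINT and store the take.
        if self.current is None:
            return  # ignore a stop event without a matching start
        self.current.out_point = time_code
        self.takes.append(self.current)
        self.current = None


logger = TakeLogger()
logger.record_start("00:03:45:29")
logger.mark_good_shot()
logger.record_stop("00:04:21:05")
print(logger.takes)
```

In this sketch the "good shot" flag is stored alongside the IN and OUT points of the same take, mirroring the association described in the text.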

As already indicated above, the PDA 112 is used to facilitate identification of
the audio/video material generated by the camera. To this end, the PDA is arranged to
associate this audio/video material with pre-planned events such as scenes, shots or
takes. The camera and PDA shown in figures 1 and 2 form part of an integrated
system for planning, acquiring and editing an audio/video production. During a
planning phase, the scenes which are required in order to produce an audio/video
production are identified. Furthermore for each scene a number of shots are identified
which are required in order to establish the scene. Within each shot, a number of takes
may be generated and from these takes a selected number may be used to form the shot
for the final edit. The planning information in this form is therefore identified at a
planning stage. Data representing or identifying each of the planned scenes and shots
is therefore loaded into the PDA 112 along with notes which will assist the director
when the audio/video material is captured. An example of such data is shown in the
table below.



A/V Production          News story: BMW disposes of Rover
Scene ID: 900015689     Outside Longbridge
Shot 5000000199         Longbridge BMW sign
Shot 5000000200         Workers leaving shift
Shot 5000000201         Workers in car park
Scene ID: 900015690     BMW HQ Munich
Shot 5000000202         Press conference
Shot 5000000203         Outside BMW building
Scene ID: 900015691     Interview with minister
Shot 5000000204         Interview



In the first column of the table above the event which will be captured by the
camera and for which audio/video material will be generated is shown. Each of the
events, which are defined in a hierarchy, is provided with an identification number.
Correspondingly, in the second column notes are provided in order to direct or remind
the director of the content of the planned shot or scene. For example, in the first row
the audio/video production is identified as being a news story, reporting the disposal of
Rover by BMW. In the extract of the planning information shown in the table above,
there are three scenes, each of which is provided with a unique identification number.
These scenes are "Outside Longbridge", "BMW HQ Munich" and "Interview with
minister". Correspondingly for each scene a number of shots are identified and
these are shown below each of the scenes with a unique shot identification number.
Notes corresponding to the content of each of these shots are also entered in the second
column. So, for example, for the first scene "Outside Longbridge", three shots are
identified which are "Longbridge BMW sign", "Workers leaving shift" and "Workers in
car park". With this information loaded onto the PDA, the director or indeed a single
camera operator may take the PDA out to the place where the news story is to be shot,
so that the planned audio/video material can be gathered. An illustration of the form of
the PDA with the graphical user interface displaying this information is shown in
figure 3.
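
The hierarchy of production, scenes and shots described above could be held in the PDA as a simple nested data structure. The following is a minimal sketch under that assumption; the class names `Production`, `Scene` and `Shot` and the method `find_shot` are hypothetical, and only identification numbers and notes quoted in the text are used as sample data.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass
class Shot:
    shot_id: str
    note: str


@dataclass
class Scene:
    scene_id: str
    note: str
    shots: List[Shot] = field(default_factory=list)


@dataclass
class Production:
    title: str
    scenes: List[Scene] = field(default_factory=list)

    def find_shot(self, shot_id: str) -> Optional[Tuple[Scene, Shot]]:
        # Walk the hierarchy to find the scene that owns the given shot.
        for scene in self.scenes:
            for shot in scene.shots:
                if shot.shot_id == shot_id:
                    return scene, shot
        return None


plan = Production(
    title="News story: BMW disposes of Rover",
    scenes=[
        Scene("900015690", "BMW HQ Munich", [
            Shot("5000000202", "Press conference"),
            Shot("5000000203", "Outside BMW building"),
        ]),
        Scene("900015691", "Interview with minister", [
            Shot("5000000204", "Interview"),
        ]),
    ],
)

found = plan.find_shot("5000000203")
if found is not None:
    scene, shot = found
    print(scene.scene_id, scene.note, "/", shot.shot_id, shot.note)
```

A lookup by shot identification number is the operation needed when the PDA communicates the current shot to the camera, since the identification number alone then determines the scene and the director's notes.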

As indicated in figure 1, the PDA 112 is arranged to communicate data to the
camera 101. To this end the meta data generation processor 128 is arranged to
communicate data with the PDA 112 via the interface 118. The interface 118 may be,
for example, an infra-red link 119 providing wireless communications in accordance
with a known standard. The PDA and the parts of the camera associated with
generating meta data which are shown in figure 2 are shown in more detail in figure 4.

In figure 4 the parts of the camera which are associated with generating meta
data and communicating with the PDA 112 are shown in a separate acquisition unit
152. However it will be appreciated that the acquisition unit 152 could also be
embodied within the camera 102. The acquisition unit 152 comprises the meta data
generation processor 128, and the data store 132. The acquisition unit 152 also
includes the clock 136 and the sensors 138, 140, 142 although for clarity these are not
shown in figure 4. Alternatively, some or all of these features which are shown in
figure 2 will be embodied within the camera 102 and the signals which are required to
define the meta data such as the time codes and the audio/video signals themselves
may be communicated via a communications link 153 which is coupled to an interface
port 154. The meta data generation processor 128 is therefore provided with access to
the time codes and the audio/video material as well as other parameters used in
generating the audio/video material. Signals representing the time codes and
parameters as well as the audio/video signals are received from the interface port 154
via the interface channel 156. The acquisition unit 152 is also provided with a screen
(not shown) which is driven by a screen driver 158. Also shown in figure 4 the
acquisition unit is provided with a communications processor 160 which is coupled to
the meta data generation processor 128 via a connecting channel 162.
Communication is effected by the communications processor 160 via a radio
frequency communications channel using the antenna 164. A pictorial representation
of the acquisition unit 152 is shown in figure 5.

The PDA 112 is also shown in figure 4. The PDA 112 is correspondingly
provided with an infra-red communications port 165 for communicating data to and
from the acquisition unit 152 via an infra-red link 119. A data processor 166 within
the PDA 112 is arranged to communicate data to and from the infra-red port 165 via a
connecting channel 166. The PDA 112 is also provided with a data store 167 and a
screen driver 168 which are connected to the data processor 166.

The pictorial representation of the PDA 112 shown in figure 3 and the
acquisition unit shown in figure 5 provide an illustration of an example embodiment of
the present invention. A schematic diagram illustrating the arrangement and
connection of the PDA 112 and the acquisition unit 152 is shown in figure 6. In the
example shown in figure 6 the acquisition unit 152 is mounted on the back of a camera
101 and coupled to the camera via a six pin remote connector and to a connecting
channel conveying the external signal representative of the time code recorded onto
the recording tape. Thus, the six pin remote connector and the time code indicated as
arrow lines form the communications channel 153 shown in figure 4. The interface
port 154 is shown in figure 6 to be a VA to DN1 conversion comprising an
RM-P9/LTC to RS422 converter 154'. RM-P9 is a camera remote control protocol,
whereas LTC is Linear Time Code in the form of an analogue signal. This is arranged
to communicate with an RS422 to RS232 converter 154" via a connecting channel
which forms part of the interface port 154. The converter 154" then communicates
with the meta data generation processor 128 via the connecting channel 156 which
operates in accordance with the RS232 standard.

Returning to figure 4, the PDA 112 which has been loaded with the pre-planned
production information is arranged to communicate the current scene and shot
for which audio/video material is to be generated by communicating the next shot ID
number via the infra-red link 119. The pre-planned information may also have been
communicated to the acquisition unit 152 and stored in the data store 132 via a
separate link or via the infra-red communication link 119. However in effect the
acquisition unit 152 is directed to generate meta data in association with the scene or
shot ID number which is currently being taken. After receiving the information of the
current shot the camera 102 is then operated to make a "take of the shot". The
audio/video material of the take is recorded onto the recording tape 126 with
corresponding time codes. These time codes are received along with the audio/video
material via the interface port 154 at the meta data generation processor 128. The
meta data generation processor 128, having been informed of the current pre-planned
shot now being taken, logs the time codes for each take of the shot. The meta data
generation processor therefore logs the IN and OUT time codes of each take and stores
these in the data store 132.

The information generated and logged by the meta data generation processor
128 is shown in the table below. In the first column the scene and shot are identified
with the corresponding ID numbers, and for each shot several takes are made by the
camera operator which are indicated in a hierarchical fashion. Thus, having received
information from the PDA 112 of the current shot, each take made by the camera
operator is logged by the meta data generation processor 128 and the IN and OUT
points for this take are shown in the second and third columns and stored in the data
store 132. This information may also be displayed on the screen of the acquisition unit
152 as shown in figure 5. Furthermore, the meta data generation processor 128 as
already explained generates the UMID for each take for the audio/video material
generated during the take. The UMID for each take forms the fourth column of the
table. Additionally, in some embodiments, to provide a unique identification of the
tape on which the material is recorded, a tape identification is generated and
associated with the meta data. The tape identification may be written on to the tape, or
stored on a random access memory chip which is embodied within the video tape
cassette body. This random access memory chip is known as a TELEFILE (RTM)
system which provides a facility for reading the tape ID number remotely. The tape
ID is written onto the magnetic tape 126 to uniquely identify this tape. In preferred
embodiments the TELEFILE (RTM) system is provided with a unique number which is
manufactured as part of the memory and so can be used as the tape ID number. In
other embodiments the TELEFILE (RTM) system provides automatically the IN/OUT
time codes of the recorded audio/video material items.

In one embodiment the information shown in the table below is arranged to be
recorded onto the magnetic tape in a separate recording channel. However, in other
embodiments the meta data shown in the table is communicated separately from the
tape 126 using either the communications processor 160 or the infra-red link 119. The
meta data may be received by the PDA 112 for analysis and may be further
communicated by the PDA.



Scene ID: 900015689     Tape ID: 00001                           UMID:
Shot 5000000199
Take 1                  IN: 00:03:45:29     OUT: 00:04:21:05     060C23B340..
Take 2                  IN: 00:04:21:20     OUT: 00:04:28:15     060C23B340..
Take 3                  IN: 00:04:28:20     OUT: 00:05:44:05     060C23B340..
Shot 5000000200
Take 1                  IN: 00:05:44:10     OUT: 00:08:22:05     060C23B340..
Take 2                  IN: 00:08:22:10     OUT: 00:08:23:05     060C23B340..
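
The IN and OUT time codes logged for each take can be converted to frame counts in order to compute the duration of a take. The sketch below assumes non-drop-frame time codes of the form HH:MM:SS:FF at 30 frames per second; the frame rate is an assumption (the table only shows that frame numbers reach 29), and the helper names `timecode_to_frames` and `frames_to_timecode` are hypothetical.

```python
FPS = 30  # assumed frame rate; not stated in the specification


def timecode_to_frames(time_code: str, fps: int = FPS) -> int:
    """Convert a non-drop-frame HH:MM:SS:FF time code to a frame count."""
    hours, minutes, seconds, frames = (int(part) for part in time_code.split(":"))
    if frames >= fps:
        raise ValueError(f"frame field {frames} is not valid at {fps} fps")
    return ((hours * 60 + minutes) * 60 + seconds) * fps + frames


def frames_to_timecode(total_frames: int, fps: int = FPS) -> str:
    """Convert a frame count back to HH:MM:SS:FF."""
    seconds, frames = divmod(total_frames, fps)
    minutes, seconds = divmod(seconds, 60)
    hours, minutes = divmod(minutes, 60)
    return f"{hours:02d}:{minutes:02d}:{seconds:02d}:{frames:02d}"


# IN and OUT points of the three takes of shot 5000000199 from the table.
takes = [
    ("Take 1", "00:03:45:29", "00:04:21:05"),
    ("Take 2", "00:04:21:20", "00:04:28:15"),
    ("Take 3", "00:04:28:20", "00:05:44:05"),
]

for name, in_point, out_point in takes:
    length = timecode_to_frames(out_point) - timecode_to_frames(in_point)
    print(name, "duration", frames_to_timecode(length))
```

Working in frame counts keeps the arithmetic exact; a different frame rate or a drop-frame time code would require a different conversion.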











The communications processor 160 may be arranged in operation to transmit
the meta data generated by the meta data generation processor 128 via a wireless
communications link. The meta data may be received via the wireless communications
link by a remotely located studio which can then acquire the meta data and process this
meta data ahead of the audio/video material recorded onto the magnetic tape 126. This
provides an advantage in improving the rate at which the audio/video production may
be generated during the post production phase in which the material is edited.

A further advantageous feature provided by embodiments of the present
invention is an arrangement in which a picture stamp is generated at certain temporal
positions within the recorded audio/video signals. A picture stamp is known to those
skilled in the art as being a digital representation of an image and in the present
example embodiment is generated from the moving video material generated by the
camera. The picture stamp may be of lower quality in order to reduce the amount of
data required to represent the image from the video signals. Therefore the picture
stamp may be compression encoded, which may result in a reduction in quality.
However a picture stamp provides a visual indication of the content of the audio/video
material and therefore is a valuable item of meta data. Thus, the picture stamp may for
example be generated at the IN and OUT time codes of a particular take. Thus, the
picture stamps may be associated with the meta data generated by the meta data
generation processor 128 and stored in the data store 132. The picture stamps are
therefore associated with items of meta data such as, for example, the time codes
which identify the place on the tape where the image represented by the picture stamp
is recorded. The picture stamps may be generated with the "Good Shot" markers. The
picture stamps are generated by the meta data generation processor 128 from the
audio/video signals received via the communications link 153. The meta data
generation processor therefore operates to effect a data sampling and compression
encoding process in order to produce the picture stamps. Once the picture stamps have
been generated they can be used for several purposes. They may be stored in a data
file and communicated separately from the tape 126, or they may be stored on the tape
126 in compressed form in a separate recording channel. Alternatively in preferred
embodiments picture stamps may be communicated using the communications
processor 160 to the remotely located studio where a producer may analyse the picture
stamps. This provides the producer with an indication as to whether the audio/video
material generated by the camera operator is in accordance with what is required.
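
The data sampling and compression encoding process is not specified in detail. As a rough illustration only, the sketch below subsamples an 8-bit greyscale image and compresses the result with `zlib` from the Python standard library. The subsampling is the step that reduces quality here; `zlib` is lossless and merely stands in for whichever compression encoder an implementation would use. The function name `make_picture_stamp` and the `step` parameter are hypothetical.

```python
import zlib
from typing import List, Tuple


def make_picture_stamp(image: List[List[int]], step: int = 4) -> Tuple[int, int, bytes]:
    """Return (width, height, compressed pixels) of a subsampled image.

    `image` is a list of rows of 8-bit greyscale values (0..255).
    Every `step`-th pixel of every `step`-th row is kept.
    """
    if step < 1:
        raise ValueError("step must be at least 1")
    small = [row[::step] for row in image[::step]]
    height = len(small)
    width = len(small[0]) if small else 0
    payload = bytes(value for row in small for value in row)
    return width, height, zlib.compress(payload)


# A 16 x 16 test image containing a diagonal gradient.
image = [[(x + y) * 8 for x in range(16)] for y in range(16)]

width, height, stamp = make_picture_stamp(image, step=4)
print(width, height, len(stamp), "bytes after compression")
```

The stamp carries far fewer pixels than the source image, which is what makes it suitable for communication over a low band width link.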

In a yet further embodiment, the picture stamps are communicated to the PDA
112 and displayed on the PDA screen. This may be effected via the infra-red link 119
or the PDA may be provided with a further wireless link which can communicate with
the communications processor 160. In this way a director having the hand held PDA
112 is provided with an indication of the current audio/video content generated by the
camera. This provides an immediate indication of the artistic and aesthetic quality of
the audio/video material currently being generated. As already explained the picture
stamps are compression encoded so that they may be rapidly communicated to the
PDA.

A further advantage of the acquisition unit 152 shown in figure 4 is that the
editing process is made more efficient by providing the editor at a remotely located
studio with an indication of the content of the audio/video material in advance of
receiving that material. This is because the picture stamps are communicated with the
meta data via a wireless link so that the editor is provided with an indication of the
content of the audio/video material in advance of receiving the audio/video material
itself. In this way the bandwidth of the audio/video material can remain high with a
correspondingly high quality whilst the meta data and picture stamps are at a relatively
low band width providing relatively low quality information. As a result of the low
band width the meta data and picture stamps may be communicated via a wireless link
on a considerably lower band width channel. This facilitates rapid communication of
the meta data describing content of the audio/video material.

The picture stamps generated by the meta data generation processor 128 can be
generated at any point during the recorded audio/video material. In one embodiment
the picture stamps are generated at the IN and OUT points of each take. However in
other embodiments of the present invention an activity processor 170 is arranged to
detect relative activity within the video material. This is effected by performing a
process in which a histogram of the colour components of the images represented by
the video signal is compiled and the rate of change of the colour components is
determined, changes in these colour components being used to indicate activity within
the image. Alternatively or in addition, motion vectors within the image are used to
indicate activity. The activity processor 170 then operates to generate a signal
indicative of the relative activity within the video material. The meta data generation
processor 128 then operates in response to the activity signal to generate picture stamps
such that more picture stamps are generated for greater activity within the images
represented by the video signals.
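
One possible reading of the histogram-based activity detection is sketched below. It is an assumption-laden illustration, not the method of the specification: frames are modelled as flat lists of (R, G, B) pixels, activity is taken as the normalised absolute change between the colour histograms of consecutive frames, and a picture stamp position is emitted each time the accumulated activity reaches a budget, so that busier material yields more stamps. The names `colour_histogram`, `activity_signal`, `stamp_positions` and the parameters `bins` and `budget` are hypothetical.

```python
from typing import List, Sequence, Tuple

Pixel = Tuple[int, int, int]  # (R, G, B), each component 0..255
Frame = Sequence[Pixel]


def colour_histogram(frame: Frame, bins: int = 8) -> List[int]:
    """Histogram of each colour component, concatenated as R, G, B."""
    hist = [0] * (3 * bins)
    bin_width = 256 // bins
    for pixel in frame:
        for channel, value in enumerate(pixel):
            index = min(value // bin_width, bins - 1)
            hist[channel * bins + index] += 1
    return hist


def activity_signal(frames: Sequence[Frame], bins: int = 8) -> List[float]:
    """Per-frame activity in the range 0..1 (0.0 for the first frame)."""
    if not frames:
        return []
    signal = [0.0]
    previous = colour_histogram(frames[0], bins)
    for frame in frames[1:]:
        current = colour_histogram(frame, bins)
        change = sum(abs(a - b) for a, b in zip(current, previous))
        # Each of the three histograms can change by at most 2 * pixel count.
        signal.append(change / (6 * len(frame)))
        previous = current
    return signal


def stamp_positions(signal: Sequence[float], budget: float = 0.5) -> List[int]:
    """Frame indices at which picture stamps would be generated."""
    if not signal:
        return []
    positions = [0]  # always stamp the first frame, e.g. the IN point
    accumulated = 0.0
    for index, value in enumerate(signal):
        accumulated += value
        if accumulated >= budget:
            positions.append(index)
            accumulated = 0.0
    return positions


red = [(255, 0, 0)] * 4
blue = [(0, 0, 255)] * 4
frames = [red, red, red, blue, red, blue]  # static at first, then busy

print(stamp_positions(activity_signal(frames)))
```

The static run of identical frames contributes no activity, so stamps are concentrated in the part of the sequence where the colour content changes. A motion-vector measure could be substituted for `activity_signal` without changing `stamp_positions`.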

In an alternative embodiment of the present invention the activity processor
170 is arranged to receive the audio signals via the connecting channel 172 and to
recognise speech within the audio signals. The activity processor 170 then generates
content data representative of the content of this speech as text. The text data is then
communicated to the meta data generation processor 128 and may be stored in the data
store 132 or communicated with other meta data via the communications processor 160
in a similar way to that already explained for the picture stamps.

Figure 7 provides a schematic representation of a post production process in
which the audio/video material is edited to produce an audio/video program. As
shown in figure 7 the meta data, which may include picture stamps and/or the speech
content information, is communicated from the acquisition unit 152 via a separate route
represented by a broken line 174, to a meta data database 176. The route 174 may be
representative of a wireless communications link formed by for example UMTS, GSM
or the like.

The database 176 stores meta data to be associated with the audio/video
material. The audio/video material in high quality form is recorded onto the tape 126.
Thus the tape 126 is transported back to the editing suite where it is ingested by an
ingestion processor 178. The tape identification (tape ID) recorded onto the tape 126
or other meta data providing an indication of the content of the audio/video material is
used to associate the meta data stored in the database 176 with the audio/video
material on the tape as indicated by the broken line 180.

As will be appreciated, although the example embodiment of the present
invention uses a video tape as the recording medium for storing the audio/video
signals, it will be understood that alternative recording media such as magnetic disks
and random access memories may also be used.

Ingestion Processor

The ingestion processor 178 is also shown in Figure 7 to be connected to a
network formed from a communications channel represented by a connecting line 182.
The connecting line 182 represents a communications channel for communicating data
to items of equipment, which form an inter-connected network. To this end, these
items of equipment are provided with a network card which may operate in accordance
with a known access technique such as Ethernet, RS422 and the like. Furthermore, as
will be explained shortly, the communications network 182 may also provide data
communications in accordance with the Serial Digital Interface (SDI) or the Serial
Digital Transport Interface (SDTI).

Also shown connected to the communications network 182 is the meta data
database 176, and an audio/video server 190, into which the audio/video material is
ingested. Furthermore, editing terminals 184, 186 are also connected to the
communications channel 182 along with a digital multi-effects processor 188.

The communications network 182 provides access to the audio/video material
present on tapes, discs or other recording media which are loaded into the ingestion
processor 178.

The meta data database 176 is arranged to receive meta data via the route 174
describing the content of the audio/video material recorded on to the recording media
loaded into the ingestion processor 178.

As will be appreciated, although in the example embodiment a video tape has
been used as the recording medium for storing the audio/video signals, it will be
understood that alternative recording media such as magnetic disks and random access
memories may also be used, and that video tape is provided as an illustrative example
only.

The editing terminals 184, 186 and the digital multi-effects processor 188 are
provided with access to the audio/video material recorded on to the tapes loaded into
the ingestion processor 178 and the meta data describing this audio/video material
stored in the meta data database 176 via the communications network 182. The
operation of the ingestion processor 178 in combination with the meta data database
176 will now be described in more detail.

Figure 8 provides an example representation of the ingestion processor 178. In
Figure 8 the ingestion processor 178 is shown to have a jog shuttle control 200 for
navigating through the audio/video material recorded on the tapes loaded into video
tape recorders/reproducers forming part of the ingestion processor 178. The ingestion
processor 178 also includes a display screen 202 which is arranged to display picture
stamps which describe selected parts of the audio/video material. The display screen
202 also acts as a touch screen providing a user with the facility for selecting the
audio/video material by touch. The ingestion processor 178 is also arranged to
display all types of meta data on the screen 202, including script, camera type,
lens types and UMIDs.

As shown in Figure 9, the ingestion processor 178 may include a plurality of
video tape recorders/reproducers into which the video tapes carrying the recorded
audio/video material may be loaded in parallel. In the example shown in
figure 9, the video tape recorders 204 are connected to the ingestion processor 178 via
an RS422 link and an SDI IN/OUT link. The ingestion processor 178 therefore
represents a data processor which can access any of the video tape recorders 204 in
order to reproduce the audio/video material from the video tapes loaded into the video
tape recorders. Furthermore, the ingestion processor 178 is provided with a network
card in order to access the communications network 182. As will be appreciated from
Figure 9 however, the communications channel 182 is comprised of a relatively low
band width data communications channel 182' and a high band width SDI channel
182" for use in streaming video data. Correspondingly, therefore, the ingestion
processor 178 is connected to the video tape recorders 204 via an RS422 link in order
to communicate requests for corresponding items of audio/video material. Having
requested these items of audio/video material, the audio/video material is
communicated back to the ingestion processor 178 via an SDI communication link 206
for distribution via the SDI network. The requests may for example include the UMID
which uniquely identifies the audio/video material item(s).

The operation of the ingestion processor in association with the meta data
database 176 will now be explained with reference to figure 10. In figure 10 the meta
data database 176 is shown to include a number of items of meta data 210 associated
with a particular tape ID 212. As shown by the broken line headed arrow 214, the tape
ID 212 identifies a particular video tape 216, on which the audio/video material
corresponding to the meta data 210 is recorded. In the example embodiment shown in
Figure 10, the tape ID 212 is written onto the video tape 218 in the linear time code
area 220. However it will be appreciated that in other embodiments, the tape ID could
be written in other places such as the vertical blanking portion. The video tape 216 is
loaded into one of the video tape recorders 204 forming part of the ingestion processor
178.

In operation one of the editing terminals 184 is arranged to access the meta
data database 176 via the low band width communications channel 182'. The editing
terminal 184 is therefore provided with access to the meta data 210 describing the
content of the audio/video material recorded onto the tape 216. The meta data 210
may include information such as the copyright owner "BSkyB", the resolution of the
picture and the format in which the video material is encoded, the name of the
program, which is in this case "Grandstand", and information such as the date, time
and audience. Meta data may further include a note of the content of the audio/video
material.

Each of the items of audio/video material is associated with a UMID, which
identifies the audio/video material. As such, the editing terminal 184 can be used to
identify and select from the meta data 210 the items of audio/video material which are
required in order to produce a program. This material may be identified by the UMID
associated with the material. In order to access the audio/video material to produce the
program, the editing terminal 184 communicates a request for this material via the low
band width communications network 182. The request includes the UMID or the
UMIDs identifying the audio/video material item(s). In response to the request for
audio/video material received from the editing terminal 184, the ingestion processor
178 is arranged to reproduce selectively these audio/video material items identified by
the UMID or UMIDs from the video tape recorder into which the video cassette 216 is
loaded. This audio/video material is then streamed via the SDI network 182" back to
the editing terminal 184 to be incorporated into the audio/video production being
edited. The streamed audio/video material is ingested into the audio/video server 190
from where the audio/video can be stored and reproduced.
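
The request-and-reproduce exchange described above can be pictured with the following minimal Python sketch. It is illustrative only: the `TAKES` dictionary, the `handle_request` function and the identifier strings are hypothetical stand-ins for the meta data database 176 and the ingestion processor 178, and are not taken from the described apparatus.

```python
# Hypothetical stand-in for the meta data database 176: each UMID maps to the
# tape and the time code range at which the material item is recorded.
TAKES = {
    "umid-0001": {"tape_id": "tape-212", "tc_in": "00:01:00:00", "tc_out": "00:01:30:00"},
    "umid-0002": {"tape_id": "tape-212", "tc_in": "00:05:10:00", "tc_out": "00:06:00:00"},
}


def handle_request(umids, loaded_tape_id):
    """Return the play-out list for the requested UMIDs found on the loaded tape."""
    playlist = []
    for umid in umids:
        take = TAKES.get(umid)
        if take is None or take["tape_id"] != loaded_tape_id:
            continue  # unknown item, or recorded on a tape that is not loaded
        playlist.append((umid, take["tc_in"], take["tc_out"]))
    return playlist


if __name__ == "__main__":
    # The editing terminal requests two items; only the known one is selected.
    print(handle_request(["umid-0002", "umid-9999"], "tape-212"))
```

In this sketch only the items named in the request are selected for reproduction, which mirrors the selective reproduction by UMID described above.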

Figure 11 provides an alternative arrangement in which the meta data 210 is
recorded onto a suitable recording medium with the audio/video material. For
example the meta data 210 could be recorded in one of the audio tracks of the video
tape 218'. Alternatively, the recording medium may be an optical disc or magnetic
disc allowing random access and providing a greater capacity for storing data. In this
case the meta data 210 may be stored with the audio/video material.

In a yet further arrangement, some or all of the meta data may be recorded onto
the tape 216. This may be recorded, for example, into the linear recording track of the
tape 218. Some meta data related to the meta data recorded onto the tape may be
conveyed separately and stored in the database 176. A further step is required in order
to ingest the meta data and to this end the ingestion processor 178 is arranged to read
the meta data from the recording medium 218' and convey the meta data via the
communications network 182' to the meta data database 176. Therefore, it will be
appreciated that the meta data associated with the audio/video material to be ingested
by the ingestion processor 178 may be ingested into the database 176 via a separate
medium or via the recording medium on which the audio/video material is also
recorded.

The meta data associated with the audio/video material may also include
picture stamps which represent low quality representations of the images at various
points throughout the video material. These may be presented at the touch screen 202
on the ingestion processor 178. Furthermore these picture stamps may be conveyed
via the network 182' to the editing terminals 184, 186 or the effects processor 188 to
provide an indication of the content of the audio/video material. The editor is
therefore provided with a pictorial representation for the audio/video material and from
this a selection of audio/video material items may be made. Furthermore, the
picture stamp may be stored in the database 176 as part of the meta data 210. The editor
may therefore retrieve a selected item for the corresponding picture stamp using the
UMID which is associated with the picture stamp.

In other embodiments of the invention, the recording medium may not have
sufficient capacity to include picture stamps recorded with the audio/video material.
This is likely to be so if the recording medium is a video tape 216. It is particularly
appropriate in this case, although not exclusively so, to generate picture stamps before
or during ingestion of the audio/video material.

Returning to Figure 7, in other embodiments, the ingestion processor 178 may
include a pre-processing unit. The pre-processing unit embodied within the ingestion
processor 178 is arranged to receive the audio/video material recorded onto the
recording medium which, in the present example, is a video tape 126. To this end, the
pre-processing unit may be provided with a separate video recorder/reproducer or may
be combined with the video tape recorder/reproducer which forms part of the ingestion
processor 178. The pre-processing unit generates picture stamps associated with the
audio/video material. As explained above, the picture stamps are used to provide a
pictorial representation of the content of the audio/video material items. However in
accordance with a further embodiment of the present invention the pre-processing unit
operates to process the audio/video material and generate an activity indicator
representative of relative activity within the content of the audio/video material. This
may be achieved for example using a processor which operates to generate an activity
signal in accordance with a histogram of colour components within the images
represented by the video signal and to generate the activity signals in accordance with
a rate of change of the colour histogram components. The pre-processing unit then
operates to generate a picture stamp at points throughout the video material where
there are periods of activity indicated by the activity signal. This is represented in
Figure 12. In Figure 12A picture stamps 224 are shown to be generated along a line
226 which represents time within the video signal. As shown in Figure 12A the
picture stamps 224 are generated at times along the time line 226 where the activity
signal represented as arrows 228 indicates events of activity. This might be for
example someone walking into and out of the field of view of the camera where there
is a great deal of motion represented by the video signal. To this end, the activity
signal may also be generated using motion vectors which may be, for example, the
motion vectors generated in accordance with the MPEG standard.
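
By way of illustration only, the histogram-based activity signal described above might be sketched in Python as follows. Several simplifications are assumptions made for the sketch and are not stated in the description: each frame is a flat list of 8-bit samples of a single colour component, the rate of change is approximated by the sum of absolute differences between the histograms of consecutive frames, and a fixed threshold decides where a picture stamp is generated. The function names are hypothetical.

```python
def colour_histogram(frame, bins=4):
    """Count samples per bin; `frame` is a flat list of 0-255 sample values."""
    hist = [0] * bins
    for value in frame:
        hist[value * bins // 256] += 1
    return hist


def activity_signal(frames, bins=4):
    """Sum of absolute histogram differences between consecutive frames."""
    hists = [colour_histogram(frame, bins) for frame in frames]
    return [
        sum(abs(a - b) for a, b in zip(prev, cur))
        for prev, cur in zip(hists, hists[1:])
    ]


def picture_stamp_positions(frames, threshold, bins=4):
    """Indices of frames whose activity, relative to the previous frame, exceeds the threshold."""
    return [
        index + 1
        for index, level in enumerate(activity_signal(frames, bins))
        if level > threshold
    ]


if __name__ == "__main__":
    dark = [10] * 8     # eight dark samples
    bright = [250] * 8  # eight bright samples
    frames = [dark, dark, bright, bright, dark]
    print(activity_signal(frames))             # [0, 16, 0, 16]
    print(picture_stamp_positions(frames, 4))  # [2, 4]
```

Picture stamps would then be taken at the returned frame positions, corresponding to the arrows 228 on the time line 226. A motion-vector based detector could replace `activity_signal` without changing the selection step.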

In other embodiments of the invention, the pre-processor may generate textual
information corresponding to speech present within the audio signal forming part of
the audio/video material items stored on the tape 126. The textual information may be
generated instead of the picture stamps or in addition to the picture stamps. In this
case, text may be generated for example for the first words of sentences and/or the first
activity of a speaker. This is detected from the audio signals present on the tape
recording or forming part of the audio/video material. The start points where text is to
be generated are represented along the time line 226 as arrows 230. Alternatively the
text could be generated at the end of sentences or indeed at other points of interest
within the speech.

At the detected start of the speech, a speech processor operates to generate a 
textual representation of the content of the speech. To this end, the time line 226 
shown in Figure 12B is shown to include the text 232 corresponding to the content of 
the speech at the start of activity periods of speech. 

The picture stamps and textual representation of the speech activity generated
by the pre-processor are communicated via the communications channel 182 to the meta
data database 176 and stored. The picture stamps and text are stored in association
with the UMID identifying the corresponding items of audio/video material from
which the picture stamps 224 and the textual information 232 were generated. This
therefore provides a facility to an editor operating one of the editing terminals 184, 186
to analyse the content of the audio/video material before it is ingested using the
ingestion processor 178. As such the video tape 126 is loaded into the ingestion
processor 178 and thereafter the audio/video material can be accessed via the network
communications channel 182. The editor is therefore provided with an indication,
very rapidly, of the content of the audio/video material and so may ingest only those
parts of the material which are relevant to the particular material items required by the
editor. This has a particular advantage in improving the efficiency with which the
editor may produce an audio/video production.

In an alternative embodiment, the pre-processor may be a separate unit and
may be provided with a screen on which the picture stamps and/or text information are
displayed, and a means such as, for example, a touch screen, to provide a facility for
selecting the audio/video material items to be ingested.

In a further embodiment of the invention, the ingestion processor 178 generates
meta data items such as UMIDs whilst the audio/video material is being ingested. This
may be required because the acquisition unit in the camera 152 is not arranged to generate
UMIDs, but does generate a Unique Material Reference Number (MURN). The
MURN is generated for each material item, such as a take. The MURN is arranged to
be considerably shorter than a UMID and can therefore be accommodated within the
linear time code of a video tape, which is more difficult for UMIDs because these are
larger. Alternatively the MURN may be written into a TELEFILE (RTM) label of the
tape. The MURN provides a unique identification of the audio/video material items
present on the tape. The MURNs may be communicated separately to the database
176 as indicated by the line 174.

At the ingestion processor 178, the MURNs for the material items are recovered
from the tape or the TELEFILE label. For each MURN, the ingestion processor 178
operates to generate a UMID corresponding to the MURN. The UMIDs are then
communicated with the MURNs to the database 176, and are ingested into the database
in association with the MURNs, which may be already present within the database
176.

Camera Meta data 

The following is provided, by way of example, to illustrate the possible types
of meta data generated during the production of a programme, and one possible
organisational approach to structuring that meta data.

Figure 13 illustrates an example structure for organising meta data. A number
of tables each comprising a number of fields containing meta data are provided. The
tables may be associated with each other by way of common fields within the
respective tables, thereby providing a relational structure. Also, the structure may
comprise a number of instances of the same table to represent multiple instances of the
object that the table may represent. The fields may be formatted in a predetermined
manner. The size of the fields may also be predetermined. Example sizes include
"Int" which represents 2 bytes, "Long Int" which represents 4 bytes and "Double"
which represents 8 bytes. Alternatively, the size of the fields may be defined with
reference to the number of characters to be held within the field such as, for example,
8, 10, 16, 32, 128, and 255 characters.
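
As a hedged illustration of the predetermined field sizes mentioned above, the following Python sketch maps the named sizes onto format codes of the standard `struct` module and pads text fields to a fixed width. The mapping, the function names, the use of one byte per character and the NUL padding are assumptions made for the sketch; the description does not specify an encoding.

```python
import struct

# Assumed mapping from the field size names in the text to struct format codes.
# The "<" prefix selects standard sizes, independent of the host platform.
FIELD_FORMATS = {
    "Int": "<h",       # 2 bytes
    "Long Int": "<l",  # 4 bytes
    "Double": "<d",    # 8 bytes
}


def field_size(type_name):
    """Size in bytes of a numeric field of the given type."""
    return struct.calcsize(FIELD_FORMATS[type_name])


def pack_text(value, width):
    """Encode a text field to exactly `width` bytes, padding with NUL bytes."""
    data = value.encode("ascii")
    if len(data) > width:
        raise ValueError("value longer than field width")
    return data.ljust(width, b"\x00")


if __name__ == "__main__":
    for name in FIELD_FORMATS:
        print(name, field_size(name))
    print(pack_text("Grandstand", 16))
```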

Turning to the structure in more detail, there is provided a Programme Table.
The Programme Table comprises a number of fields including Programme ID (PID),
Title, Working Title, Genre ID, Synopsis, Aspect Ratio, Director ID and Picturestamp.
Associated with the Programme Table is a Genre Table, a Keywords Table, a Script
Table, a People Table, a Schedule Table and a plurality of Media Object Tables.

The Genre Table comprises a number of fields including Genre ID, which is 
associated with the Genre ID field of the Programme Table, and Genre Description. 

The Keywords Table comprises a number of fields including Programme ID,
which is associated with the Programme ID field of the Programme Table, Keyword
ID and Keyword.

The Script Table comprises a number of fields including Script ID, Script
Name, Script Type, Document Format, Path, Creation Date, Original Author, Version,
Last Modified, Modified By, PID associated with Programme ID and Notes. The
People Table comprises a number of fields including Image.

The People Table is associated with a number of Individual Tables and a
number of Group Tables. Each Individual Table comprises a number of fields
including Image. Each Group Table comprises a number of fields including Image.
Each Individual Table is associated with either a Production Staff Table or a Cast
Table.

The Production Staff Table comprises a number of fields including Production
Staff ID, Surname, Firstname, Contract ID, Agent, Agency ID, E-mail, Address, Phone
Number, Role ID, Notes, Allergies, DOB, National Insurance Number and Bank ID
and Picture Stamp.

The Cast Table comprises a number of fields including Cast ID, Surname,
Firstname, Character Name, Contract ID, Agent, Agency ID, Equity Number, E-mail,
Address, Phone Number, DOB and Bank ID and Picture Stamp. Associated with the
Production Staff Table and Cast Table are a Bank Details Table and an Agency Table.

The Bank Details Table comprises a number of fields including Bank ID,
which is associated with the Bank ID field of the Production Staff Table and the Bank
ID field of the Cast Table, Sort Code, Account Number and Account Name.

The Agency Table comprises a number of fields including Agency ID, which is
associated with the Agency ID field of the Production Staff Table and the Agency ID
field of the Cast Table, Name, Address, Phone Number, Web Site and E-mail and a
Picture Stamp. Also associated with the Production Staff Table is a Role Table.

The Role Table comprises a number of fields including Role ID, which is
associated with the Role ID field of the Production Staff Table, Function and Notes
and a Picture Stamp. Each Group Table is associated with an Organisation Table.

The Organisation Table comprises a number of fields including Organisation ID,
Name, Type, Address, Contract ID, Contact Name, Contact Phone Number and Web
Site and a Picture Stamp.

Each Media Object Table comprises a number of fields including Media Object
ID, Name, Description, Picturestamp, PID, Format, Schedule ID, Script ID and Master
ID. Associated with each Media Object Table is the People Table, a Master Table, a
Schedule Table, a Storyboard Table, a Script Table and a number of Shot Tables.

The Master Table comprises a number of fields including Master ID, which is
associated with the Master ID field of the Media Object Table, Title, Basic UMID,
EDL ID, Tape ID and Duration and a Picture Stamp.

The Schedule Table comprises a number of fields including Schedule ID,
Schedule Name, Document Format, Path, Creation Date, Original Author, Start Date,
End Date, Version, Last Modified, Modified By and Notes and PID which is
associated with the Programme ID.

The Contract Table contains: a Contract ID which is associated with the Contract
ID of the Production Staff, Cast, and Organisation Tables; commencement date, rate, job
title, expiry date and details.

The Storyboard Table comprises a number of fields including Storyboard ID,
which is associated with the Storyboard ID of the Shot Table, Description, Author,
Path and Media ID.

Each Shot Table comprises a number of fields including Shot ID, PID, Media
ID, Title, Location ID, Notes, Picturestamp, Script ID, Schedule ID, and Description.
Associated with each Shot Table is the People Table, the Schedule Table, Script Table,
a Location Table and a number of Take Tables.

The Location Table comprises a number of fields including Location ID, which
is associated with the Location ID field of the Shot Table, GPS, Address, Description,
Name, Cost Per Hour, Directions, Contact Name, Contact Address and Contact Phone
Number and a Picture Stamp.

Each Take Table comprises a number of fields including Basic UMID, Take
Number, Shot ID, Media ID, Timecode IN, Timecode OUT, Sign Meta data, Tape ID,
Camera ID, Head Hours, Videographer, IN Stamp, OUT Stamp, Lens ID, AUTOID
ingest ID and Notes. Associated with each Take Table is a Tape Table, a Task Table,
a Camera Table, a Lens Table, an Ingest Table and a number of Take Annotation Tables.

The Ingest Table contains an Ingest ID which is associated with the Ingest ID in
the Take Table and a description.

The Tape Table comprises a number of fields including Tape ID, which is
associated with the Tape ID field of the Take Table, PID, Format, Max Duration, First
Usage, Max Erasures, Current Erasure, ETA (estimated time of arrival) and Last
Erasure Date and a Picture Stamp.

The Task Table comprises a number of fields including Task ID, PID, Media 
ID, Shot ID, which are associated with the Media ID and Shot ID fields respectively of 
the Take Table, Title, Task Notes, Distribution List and CC List. Associated with the 
Task Table is a Planned Shot Table. 

The Planned Shot Table comprises a number of fields including Planned Shot
ID, PID, Media ID, Shot ID, which are associated with the PID, Media ID and Shot ID
respectively of the Task Table, Director, Shot Title, Location, Notes, Description,
Videographer, Due Date, Programme Title, Media Title, Aspect Ratio and Format.

The Camera Table comprises a number of fields including Camera ID, which is
associated with the Camera ID field of the Take Table, Manufacturer, Model, Format,
Serial Number, Head Hours, Lens ID, Notes, Contact Name, Contact Address and
Contact Phone Number and a Picture Stamp.

The Lens Table comprises a number of fields including Lens ID, which is
associated with the Lens ID field of the Take Table, Manufacturer, Model, Serial
Number, Contact Name, Contact Address and Contact Phone Number and a Picture
Stamp.

Each Take Annotation Table comprises a number of fields including Take
Annotation ID, Basic UMID, Timecode, Shutter Speed, Iris, Zoom, Gamma, Shot
Marker ID, Filter Wheel, Detail and Gain. Associated with each Take Annotation
Table is a Shot Marker Table.

The Shot Marker Table comprises a number of fields including Shot Marker
ID, which is associated with the Shot Marker ID of the Take Annotation Table, and
Description.

UMID Description

A UMID is described in SMPTE Journal March 2000 which provides details of
the UMID standard. Referring to Figures 14 and 15, a basic and an extended UMID
are shown. The extended UMID comprises a first set of 32 bytes of basic UMID and a
second set of 32 bytes of signature meta data.

The first set of 32 bytes is the basic UMID. The components are:

•A 12-byte Universal Label to identify this as a SMPTE UMID. It defines the
type of material which the UMID identifies and also defines the methods by which the
globally unique Material and locally unique Instance numbers are created.

•A 1-byte length value to define the length of the remaining part of the UMID.

•A 3-byte Instance number which is used to distinguish between different
'instances' of material with the same Material number.

•A 16-byte Material number which is used to identify each clip. Each Material
number is the same for related instances of the same material.
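
The layout of the basic UMID listed above can be illustrated with a short Python sketch that splits a 32-byte value into its four components in the order given. The `BasicUmid` tuple and the `parse_basic_umid` function are hypothetical names introduced for the illustration, and the sample bytes in the usage example are arbitrary rather than a valid UMID.

```python
from collections import namedtuple

# Hypothetical container for the four components of a basic UMID.
BasicUmid = namedtuple("BasicUmid", "label length instance material")


def parse_basic_umid(data):
    """Split a 32-byte basic UMID into its four components."""
    if len(data) != 32:
        raise ValueError("a basic UMID is 32 bytes long")
    return BasicUmid(
        label=data[0:12],      # 12-byte Universal Label
        length=data[12],       # 1-byte length value
        instance=data[13:16],  # 3-byte Instance number
        material=data[16:32],  # 16-byte Material number
    )


if __name__ == "__main__":
    sample = bytes(range(32))  # arbitrary bytes, used only to show the offsets
    umid = parse_basic_umid(sample)
    print(umid.label.hex(), umid.length, umid.instance.hex(), umid.material.hex())
```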
The second set of 32 bytes is the signature meta data, a set of packed meta
data items used to create an extended UMID. The extended UMID comprises the
basic UMID followed immediately by signature meta data which comprises:

•An 8-byte time/date code identifying the time and date of the Content Unit
creation.

•A 12-byte value which defines the spatial co-ordinates at the time of Content
Unit creation.

•3 groups of 4-byte codes which register the country, organisation and user
codes.

Each component of the basic and extended UMIDs will now be defined in turn.

The 12-byte Universal Label 



The first 12 bytes of the UMID provide identification of the UMID by the 
registered string value defined in table 1. 



Byte No.  Description                               Value (hex)
1         Object Identifier                         06h
2         Label size                                0Ch
3         Designation: ISO                          2Bh
4         Designation: SMPTE                        34h
5         Registry: Dictionaries                    01h
6         Registry: Meta data Dictionaries          01h
7         Standard: Dictionary Number               01h
8         Version number                            01h
9         Class: Identification and location        01h
10        Sub-class: Globally Unique Identifiers    01h
11        Type: UMID (Picture, Audio, Data, Group)  01, 02, 03, 04h
12        Type: Number creation method              XXh

Table 1: Specification of the UMID Universal Label

The hex values in Table 1 may be changed: the values given are examples.
Also the bytes 1-12 may have designations other than those shown by way of example
in the table. Referring to Table 1, in the example shown byte 4 indicates that bytes
5-12 relate to a data format agreed by SMPTE. Byte 5 indicates that bytes 6 to 10
relate to "dictionary" data. Byte 6 indicates that such data is "meta data" defined by
bytes 7 to 10. Byte 7 indicates the part of the dictionary containing meta data defined
by bytes 9 and 10. Byte 8 indicates the version of the dictionary. Byte 9 indicates
the class of data and Byte 10 indicates a particular item in the class.

In the present embodiment bytes 1 to 10 have fixed pre-assigned values. Byte
11 is variable. Thus referring to Figure 15, and to Table 1 above, it will be noted that
the bytes 1 to 10 of the label of the UMID are fixed. Therefore they may be replaced
by a 1 byte 'Type' code T representing the bytes 1 to 10. The type code T is followed
by a length code L. That is followed by 2 bytes, one of which is byte 11 of Table 1
and the other of which is byte 12 of Table 1, an instance number (3 bytes) and a
material number (16 bytes). Optionally the material number may be followed by the
signature meta data of the extended UMID and/or other meta data.

The UMID type (byte 11) has 4 separate values to identify each of 4 different
data types as follows:

'01h' = UMID for Picture material

'02h' = UMID for Audio material

'03h' = UMID for Data material

'04h' = UMID for Group material (i.e. a combination of related essence).

The last (12th) byte of the 12 byte label identifies the methods by which the
material and instance numbers are created. This byte is divided into top and bottom
nibbles where the top nibble defines the method of Material number creation and the
bottom nibble defines the method of Instance number creation.
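
The division of byte 12 into two nibbles can be illustrated with the following short Python sketch. The function name is hypothetical and the example value is arbitrary; the meaning of each method code is defined by the standard and is not reproduced here.

```python
def split_creation_methods(byte12):
    """Return (material_method, instance_method) from byte 12 of the label."""
    if not 0 <= byte12 <= 0xFF:
        raise ValueError("byte 12 must fit in one byte")
    material_method = byte12 >> 4    # top nibble: Material number creation method
    instance_method = byte12 & 0x0F  # bottom nibble: Instance number creation method
    return material_method, instance_method


if __name__ == "__main__":
    # Arbitrary example value: top nibble 2, bottom nibble 1.
    print(split_creation_methods(0x21))
```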

Length 

The Length is a 1-byte number with the value '13h' for basic UMIDs and '33h'
for extended UMIDs.
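
The two Length values are consistent with the component sizes given earlier, on the reading that the Length counts the bytes which follow it. The following Python sketch checks that arithmetic; the constant and function names are hypothetical.

```python
# Component sizes in bytes, as listed for the basic and extended UMID.
LABEL_LEN = 12
LENGTH_LEN = 1
INSTANCE_LEN = 3
MATERIAL_LEN = 16
SIGNATURE_LEN = 32


def expected_length(extended):
    """Number of bytes following the Length byte."""
    remaining = INSTANCE_LEN + MATERIAL_LEN
    if extended:
        remaining += SIGNATURE_LEN
    return remaining


def total_size(extended):
    """Total UMID size including the label and the Length byte."""
    return LABEL_LEN + LENGTH_LEN + expected_length(extended)


if __name__ == "__main__":
    print(hex(expected_length(False)), total_size(False))  # 0x13 32
    print(hex(expected_length(True)), total_size(True))    # 0x33 64
```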

Instance Number

The Instance number is a unique 3-byte number which is created by one of
several means defined by the standard. It provides the link between a particular
'instance' of a clip and externally associated meta data. Without this instance number,
all material could be linked to any instance of the material and its associated meta data.

The creation of a new clip requires the creation of a new Material number
together with a zero Instance number. Therefore, a non-zero Instance number
indicates that the associated clip is not the source material. An Instance number is
primarily used to identify associated meta data related to any particular instance of a
clip.

Material Number 

The 16-byte Material number is a non-zero number created by one of several
means identified in the standard. The number is dependent on a 6-byte registered port
ID number, time and a random number generator.

Signature Meta data

Any component from the signature meta data may be null-filled where no
meaningful value can be entered. Any null-filled component is wholly null-filled to
clearly indicate to a downstream decoder that the component is not valid.

The Time-Date Format

The date-time format is 8 bytes where the first 4 bytes are a UTC (Coordinated
Universal Time) based time component. The time is defined either by an AES3 32-bit
audio sample clock or SMPTE 12M depending on the essence type.

The second 4 bytes define the date based on the Modified Julian Date (MJD) as
defined in SMPTE 309M. This counts up to 999,999 days after midnight on the 17th
November 1858 and allows dates to the year 4597.
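
The relationship between a Modified Julian Date day count and a calendar date can be sketched in Python using the standard `datetime` module. The sketch covers only the day count itself; how the count is packed into the 4 date bytes is defined by SMPTE 309M and is not modelled here. The function names are hypothetical.

```python
from datetime import date, timedelta

# Day 0 of the Modified Julian Date, as stated in the text.
MJD_EPOCH = date(1858, 11, 17)


def mjd_to_date(mjd):
    """Convert a Modified Julian Date day count to a calendar date."""
    return MJD_EPOCH + timedelta(days=mjd)


def date_to_mjd(day):
    """Convert a calendar date to a Modified Julian Date day count."""
    return (day - MJD_EPOCH).days


if __name__ == "__main__":
    print(date_to_mjd(date(2000, 1, 1)))  # 51544
    print(mjd_to_date(51544))             # 2000-01-01
```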

The Spatial Co-ordinate Format

The spatial co-ordinate value consists of three components defined as follows:

•Altitude: 8 decimal numbers specifying up to 99,999,999 metres.

•Longitude: 8 decimal numbers specifying East/West 180.00000 degrees (5
decimal places active).

•Latitude: 8 decimal numbers specifying North/South 90.00000 degrees (5
decimal places active).

The Altitude value is expressed as a value in metres from the centre of the earth
thus allowing altitudes below the sea level.

It should be noted that although spatial co-ordinates are static for most clips,
this is not true for all cases. Material captured from a moving source such as a camera
mounted on a vehicle may show changing spatial co-ordinate values.






Country Code 

The Country code is an abbreviated 4-byte alpha-numeric string according to
the set defined in ISO 3166. Countries which are not registered can obtain a registered
alpha-numeric string from the SMPTE Registration Authority.

Organisation Code

The Organisation code is an abbreviated 4-byte alpha-numeric string registered
with SMPTE. Organisation codes have meaning only in relation to their registered
Country code so that Organisation codes can have the same value in different
countries.

User Code

The User code is a 4-byte alpha-numeric string assigned locally by each
organisation and is not globally registered. User codes are defined in relation to their
registered Organisation and Country codes so that User codes may have the same
value in different organisations and countries.

Freelance Operators

Freelance operators may use their country of domicile for the country code and
use the Organisation and User codes concatenated into, for example, an 8 byte code
which can be registered with SMPTE. These freelance codes may start with the '~'
symbol (ISO 8859 character number 7Eh) followed by a registered 7 digit
alphanumeric string.

As will be appreciated by those skilled in the art various modifications may be
made to the embodiments herein before described without departing from the scope of
the present invention. For example whilst embodiments have been described with
recording audio/video onto magnetic tape, it will be appreciated that other recording
media are possible.

Having regard to the description of example embodiments of the invention
described above, it will be appreciated that a further aspect of the present invention
provides a video processing apparatus and an audio processing apparatus for
processing video signals representing images and audio signals representing sound,
the video and audio processing apparatus comprising an activity detector which is
arranged in operation to receive the video signals and the audio signals respectively
and to generate an activity signal indicative of an amount of activity within the images
represented by the video signal, and the sound within the audio signal respectively, and
a meta data generator coupled to the activity detector which is arranged in operation to
receive the video signal and the audio signal respectively and the activity signal and to
generate meta data representative of the content of the video signals and audio signals
at temporal positions within the video signal and audio signal respectively, which
temporal positions are determined from the activity signal.

As will be appreciated those features of the invention which appear in the
example embodiments as a data processor or processing units could be implemented in
hardware as well as a software computer program running on an appropriate data
processor. Correspondingly those aspects and features of the invention which are
described as computer or application programs running on a data processor may be
implemented as dedicated hardware. It will therefore be appreciated that a computer
program running on a data processor which serves to form an audio and/or video
generation apparatus as herein before described is an aspect of the present invention.
Similarly a computer program recorded onto a recordable medium which serves to
define the method according to the present invention or when loaded onto a computer
forms an apparatus according to the present invention are aspects of the present
invention.

Whilst the embodiments described above each include explicitly recited
combinations of features according to different aspects of the present invention, other
embodiments are envisaged according to the general teaching of the invention, which
include combinations of features as appropriate, other than those explicitly recited in
the embodiments described above. Accordingly, it will be appreciated that different
combinations of features of the appended independent and dependent claims form
further aspects of the invention other than those which are explicitly recited in the
claims.

CLAIMS

1. An audio/video reproducing apparatus connectable to a
communications network for selectively reproducing items of audio/video material
from a recording medium in response to a request received via said communications
network.

2. An audio/video reproducing apparatus as claimed in Claim 1,
comprising

- a control processor which is arranged in use to receive data representing said
request for said audio/video material item via said communications network, and

- a reproducing processor coupled to the control processor and arranged in
response to signals identifying said audio/video material items from said control
processor to reproduce said audio/video material items, which are communicated via
said communications network.

3. An audio/video reproducing apparatus as claimed in Claim 1 or 2,
comprising

- a first network interface connectable to a first communications network for
receiving said data representing said requests for said audio/video material items, and

- a second network interface connectable to a second communications network
for communicating said items of audio/video material.

4. An audio/video reproducing apparatus as claimed in any preceding
Claim, wherein said first network interface is arranged to operate in accordance with a
data communications network standard such as Ethernet, RS 322 or RS 422 or the like.

5. An audio/video reproducing apparatus as claimed in any of Claims 3 or
4, wherein said second network interface is arranged to operate in accordance with
the Serial Digital Interface (SDI) or the Serial Digital Transport Interface (SDTI).

6. An audio/video reproducing apparatus as claimed in any preceding
Claim, wherein said data representing requests for audio/video material items includes
meta data indicative of the audio/video material items.

7. An audio/video reproducing apparatus as claimed in any preceding
Claim, wherein said meta data is at least one of UMID, tape ID and time codes, and a
Unique Material Reference Number, identifying the material items.

8. An audio/video reproducing apparatus as claimed in any of Claims 2 to
7, wherein said reproducing apparatus comprises a plurality of audio/video
recording/reproducing apparatus each of which is coupled to said control processor via
a local data bus.

9. An audio/video reproducing apparatus as claimed in Claim 8, wherein 
said local data bus includes a control communications channel for communicating control 
data to and/or from said control processor, and a video data communications channel for 
communicating said items of audio/video material from said plurality of audio/video 
recording/reproducing apparatus to said communications network. 

10. An audio/video reproducing apparatus as claimed in any preceding 
Claim, comprising 

- a display device which is arranged in operation to display images 
representative of said audio/video material items present on said recording medium. 

11. An audio/video reproducing apparatus as claimed in Claim 10, wherein 
said display device is a touch screen coupled to said control processor, and arranged in 
use to receive touch commands from a user for selecting said items of audio/video 
material. 

12. An audio/video reproducing apparatus as claimed in any of Claims 2 to 
11, wherein said control processor is arranged to generate data representing a material 
identifier for each of said audio/video material items, from data recorded with said 
audio/video material items on said recording medium. 

13. An audio/video reproducing apparatus as claimed in Claim 12, wherein 
said material identifier is a UMID or the like. 

14. A method of reproducing items of audio/video material from a 
recording medium, comprising the steps of 

- communicating an identification of a selected item of audio/video material via 
a communications network, 

- receiving said identification at an audio/video reproducing apparatus in which 
said recording medium is loaded, and 

- selectively reproducing said item of audio/video material from said recording 
medium in accordance with said identification. 
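
The steps of Claim 14 can be pictured with a minimal sketch. It assumes, purely for illustration, that the identification is communicated as a JSON message carrying a tape ID and an in-point time code; the names `MaterialItem`, `ReproducingApparatus` and `handle_request` are hypothetical and do not appear in the specification.

```python
import json
from dataclasses import dataclass


@dataclass(frozen=True)
class MaterialItem:
    """One item of audio/video material on the loaded recording medium."""
    tape_id: str
    in_timecode: str   # start time code of the item
    out_timecode: str  # end time code of the item
    label: str


class ReproducingApparatus:
    """Hypothetical stand-in for the apparatus in which the medium is loaded."""

    def __init__(self, items):
        # Index the items by (tape ID, in time code) so that a received
        # identification selects exactly one item.
        self._items = {(i.tape_id, i.in_timecode): i for i in items}

    def handle_request(self, message: str) -> MaterialItem:
        """Receive an identification and select the matching item."""
        ident = json.loads(message)
        key = (ident["tape_id"], ident["in_timecode"])
        if key not in self._items:
            raise KeyError(f"no item on the loaded medium matches {key}")
        return self._items[key]


# The requesting side communicates the identification over the network;
# in this sketch the "network" is simply a JSON string.
request = json.dumps({"tape_id": "TAPE-0001", "in_timecode": "00:01:10:00"})
apparatus = ReproducingApparatus([
    MaterialItem("TAPE-0001", "00:00:00:00", "00:01:09:24", "establishing shot"),
    MaterialItem("TAPE-0001", "00:01:10:00", "00:02:30:00", "interview"),
])
selected = apparatus.handle_request(request)
```

Only the selection step is modelled here; actual reproduction of the selected item from the medium is outside the scope of the sketch.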

15. A video processing apparatus for processing video signals representing 
images comprising 

- an activity detector which is arranged in operation to receive said video 
signals and to generate an activity signal indicative of an amount of activity within the 
images represented by the video signal, and 

- an image generator coupled to the activity detector which is arranged in 
operation to receive said video signal and said activity signal and to generate sample 
images at temporal positions within said video signal, which temporal positions are 
determined from said activity signal. 

16. A video processing apparatus as claimed in Claim 15, wherein said 
activity signal is representative of a relative amount of activity within the images 
represented by said video signal and said image generator is arranged in operation to 
produce more of said sample images during periods of greater activity indicated by 
said activity signal. 

17. A video processing apparatus as claimed in Claim 15 or 16, wherein 
said sample images are represented by a substantially reduced amount of data in 
comparison to said images represented by said video signal. 
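
One way to read Claims 15 to 17 together is sketched below. It is an illustration under stated assumptions rather than the claimed implementation: the activity signal is taken to be one number per frame, samples are taken whenever the accumulated activity reaches a threshold (so busier passages yield more samples), and the "substantially reduced amount of data" is modelled by simple pixel sub-sampling. The function names `sample_positions` and `reduce_image` are hypothetical.

```python
def sample_positions(activity, threshold):
    """Return frame indices at which to take sample images.

    The activity values are accumulated frame by frame; each time the running
    total reaches `threshold` a sample is taken and the total is reduced by
    `threshold`. Busy passages therefore yield more samples than quiet ones.
    """
    if threshold <= 0:
        raise ValueError("threshold must be positive")
    positions = []
    total = 0.0
    for index, value in enumerate(activity):
        total += value
        if total >= threshold:
            positions.append(index)
            total -= threshold
    return positions


def reduce_image(image, step):
    """Keep every `step`-th pixel in both directions (a crude thumbnail)."""
    return [row[::step] for row in image[::step]]


# Eight quiet frames followed by eight busy frames.
activity = [0.125] * 8 + [1.0] * 8
positions = sample_positions(activity, threshold=2.0)

frame = [[x + 8 * y for x in range(8)] for y in range(8)]  # 8x8 dummy image
thumbnail = reduce_image(frame, step=4)
```

With these inputs every sample position falls in the busy second half of the sequence, and the 8x8 frame is reduced to a 2x2 thumbnail.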

18. A video processing apparatus as claimed in any of Claims 15 to 17, 
comprising 

- a reproduction processor which is arranged in operation to receive a recording 
medium on which said video signals are recorded and to reproduce said video signals 
from said recording medium. 

19. A video processing apparatus as claimed in Claim 18, wherein said 
image generator is arranged in operation to generate, for each of said sample images, a 
material identification representative of a location on said recording medium where the 
video signals corresponding to said sample images are recorded. 

20. A video processing apparatus as claimed in any of Claims 15 to 19, 
comprising 

- a display device for displaying said sample images. 

21. A video processing apparatus as claimed in Claim 20, wherein said 
display device is arranged to display said sample images at locations on said display 
device which are representative of the location on said recording medium at which said 
sample images are recorded. 

22. A video processing apparatus as claimed in any of Claims 15 to 21, 
comprising 

- a communications processor which is arranged in operation to communicate 
said sample images. 

23. A video processing apparatus as claimed in any of Claims 15 to 22, 
wherein said activity detector generates said activity signal by forming a histogram of 
colour components of said video image and determining a rate of change of said colour 
components. 
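
The histogram approach of Claim 23 can be sketched as follows. The bin count, the RGB pixel representation and the use of a summed absolute difference as the "rate of change" are assumptions made for illustration; the claim does not fix any of them, and the function names are hypothetical.

```python
def colour_histogram(frame, bins=4, max_value=256):
    """Histogram of the R, G and B components of one frame.

    `frame` is a list of (r, g, b) pixels with components in 0..max_value-1.
    Returns a flat list of 3 * bins counts.
    """
    width = max_value // bins
    hist = [0] * (3 * bins)
    for pixel in frame:
        for channel, value in enumerate(pixel):
            hist[channel * bins + min(value // width, bins - 1)] += 1
    return hist


def activity_signal(frames, bins=4):
    """Rate of change of the colour histogram between successive frames.

    Each value is the summed absolute difference between the histograms of a
    frame and its predecessor, normalised by the number of pixels per frame.
    """
    signal = []
    previous = None
    for frame in frames:
        current = colour_histogram(frame, bins)
        if previous is not None:
            change = sum(abs(a - b) for a, b in zip(current, previous))
            signal.append(change / len(frame))
        previous = current
    return signal


red = [(255, 0, 0)] * 4    # four red pixels
blue = [(0, 0, 255)] * 4   # four blue pixels
signal = activity_signal([red, red, blue])
```

Two identical frames give an activity of zero, while the cut from red to blue produces a large value, which is the behaviour an activity detector of this kind relies on.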

24. A video processing apparatus as claimed in any of Claims 15 to 23, 
wherein said activity detector generates said activity signal from motion vectors of 
image components of said video image signal. 
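
For the motion-vector variant of Claim 24, a comparable sketch is given below. It assumes that each frame comes with a list of per-block displacement vectors and uses their mean magnitude as the activity value; both the input format and the function name `motion_activity` are hypothetical.

```python
import math


def motion_activity(motion_vectors):
    """Mean magnitude of the motion vectors of one frame.

    `motion_vectors` is a list of (dx, dy) displacements, one per image
    block; an empty list is treated as no activity.
    """
    if not motion_vectors:
        return 0.0
    total = sum(math.hypot(dx, dy) for dx, dy in motion_vectors)
    return total / len(motion_vectors)


still = [(0, 0), (0, 0)]      # nothing moves
panning = [(3, 4), (3, 4)]    # every block moves by 5 pixels
activity = [motion_activity(v) for v in (still, panning)]
```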

25. An editing system having a database connected to a communications 
channel and a video processor as claimed in any of Claims 22, 23 or 24, connected to 
said communications channel via the communications processor, said communications 
processor being arranged in operation to communicate said sample images to said 
database, in which said sample images are stored. 

26. An audio processing apparatus for processing audio signals 
representative of sound, said audio processing apparatus comprising 

- a speech analysis processor which is arranged in operation to generate speech 
data identifying speech detected within said audio signals, 

- an activity processor coupled to said speech analysis processor and arranged 
in operation to generate an activity signal in response to said speech data, and 

- a content information generator, coupled to said activity processor and said 
speech analysis processor and arranged in operation to generate data representing the 
content of said speech at temporal positions within said audio signal determined by 
said activity signal. 

27. An audio processing apparatus as claimed in Claim 26, wherein said 
activity signal is indicative of the start of a speech sentence. 
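
The relationship between the speech data, the activity signal and the content data in Claims 26 and 27 can be illustrated with the sketch below. It assumes that a speech recogniser (not shown) has already produced time-stamped, punctuated words, and it treats the activity signal as the list of times at which a sentence starts. The input format and the function names are hypothetical.

```python
SENTENCE_END = (".", "?", "!")


def activity_from_speech(speech_data):
    """Return the time stamps at which a new sentence starts.

    `speech_data` is a list of (time_in_seconds, word) pairs in time order.
    A word starts a sentence if it is the first word or if the previous word
    ended with sentence-final punctuation.
    """
    starts = []
    previous_word = None
    for time, word in speech_data:
        if previous_word is None or previous_word.endswith(SENTENCE_END):
            starts.append(time)
        previous_word = word
    return starts


def content_information(speech_data, starts):
    """Group the words into one text entry per sentence start."""
    entries = []
    start_set = set(starts)
    for time, word in speech_data:
        if time in start_set:
            entries.append([time, word])
        elif entries:
            entries[-1][1] += " " + word
    return [(time, text) for time, text in entries]


speech_data = [
    (0.0, "Good"), (0.4, "evening."),
    (1.5, "Here"), (1.8, "is"), (2.0, "the"), (2.3, "news."),
]
starts = activity_from_speech(speech_data)
content = content_information(speech_data, starts)
```

Each entry in `content` pairs a temporal position with the text spoken from that position, which is the kind of content data that could later be displayed or stored.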

28. An audio processing apparatus as claimed in Claims 26 or 27, 
comprising 

- a reproduction processor which is arranged in operation to receive a recording 
medium on which said audio signals are recorded and to reproduce said audio signals 
from said recording medium. 

29. An audio processing apparatus as claimed in any of Claims 26, 27 or 
28, wherein said content information generator is arranged in operation to generate, for 
each item of said content data, a material identification representative of a location on 
said recording medium where the audio signals corresponding to said content data are 
recorded. 

30. An audio processing apparatus as claimed in any of Claims 26 to 29, 
wherein said content data is representative of text corresponding to the content of the 
speech. 

31. An audio processing apparatus as claimed in Claim 30, comprising 

- a display device for displaying said text. 

32. An audio processing apparatus as claimed in Claim 31, wherein said 
display device is arranged to display said text with respect to a location on said display 
device which is representative of a location on said recording medium at which said 
text is recorded. 

33. An audio processing apparatus as claimed in any of Claims 26 to 32, 
comprising 

- a communications processor which is arranged in operation to communicate 
said content data. 

34. An editing system having a database connected to a communications 
channel and an audio processor as claimed in Claim 33, connected to said 
communications channel via the communications processor, said communications 
processor being arranged in operation to communicate said content data to said 
database, in which said content data is stored. 

35. An audio/video processing apparatus comprising 

- a video processing apparatus as claimed in any of Claims 15 to 24, and 

- an audio processing apparatus as claimed in any of Claims 26 to 33. 

36. A method of processing video signals comprising the steps of 

- generating an activity signal indicative of an amount of activity within the 
images represented by the video signal, and 

- generating sample images at temporal positions within said video signal, 
which temporal positions are determined from said activity signal. 

37. A method of processing audio signals representative of sound, said 
method comprising the steps of 

- generating speech data identifying speech detected within said audio signals, 

- generating an activity signal in response to said speech data, and 

- generating data representing the content of said speech at temporal positions 
within said audio signal determined by said activity signal. 

38. A system for editing audio/video productions comprising 

- an ingestion processor having means for receiving a recording medium and 
being arranged in use to reproduce audio/video material items from said recording 
medium, 

- a data base operable to receive and to store meta data describing the contents 
of said audio/video material items on said recording medium, and 

- an editing processor coupled to said ingestion processor and said data base, 
said editing processor having a graphical user interface for displaying a representation 
of said meta data stored in said data base and for selecting said audio/video material 
items from said displayed representation of said meta data, said editing processor 
being arranged to combine user selected items of audio/video material, which are 
selectively reproduced by said ingestion processor in response to meta data 
corresponding to said selected items of audio/video material being communicated to 
said ingestion processor by said editing processor. 

39. A system as claimed in Claim 38, wherein said editing processor is 
coupled to said data base and said audio/video reproducing apparatus via a data 
communications network. 



40. A system as claimed in Claim 39, wherein said data communications 
network comprises 

- a first communications channel coupled to said editing station, said data base 
and said ingestion processor for communicating said meta data, and 

- a second communications channel coupled to said editing station, said data 
base and said ingestion processor for communicating said items of audio/video 
material. 

41. A system as claimed in Claim 40, wherein said first network interface is 
arranged to operate in accordance with a data communications network standard such 
as Ethernet, RS 232 or RS 422 or the like. 

42. A system as claimed in Claims 40 or 41, wherein said second network 
interface is arranged to operate in accordance with the Serial Digital Interface (SDI) 
or the Serial Digital Transport Interface (SDTI). 

43. A system as claimed in any of Claims 38 to 42, wherein said meta data 
includes at least one of UMID, tape ID and time codes, and a Unique Material 
Reference Number, identifying the material items. 

44. A system as claimed in any of Claims 38 to 43, wherein said meta data 
includes sample images representing the content of the audio/video material items at 
sample temporal positions within said audio/video material items. 

45. A system as claimed in any of Claims 38 to 44, wherein said recording 
medium includes said meta data describing the content of the audio/video material 
items recorded on to said recording medium, and said ingestion processor is arranged 
in operation to reproduce said meta data and to communicate said meta data via said 
network to said data base, said data base operating to receive and to store said meta 
data. 

46. A method of generating an audio/video production by selecting and 
combining items of meta data, said method comprising the steps of 

- loading a recording medium on which items of audio/video material are 
recorded into an ingestion processor; 

- reviewing meta data describing the content of the audio/video material items 
on said recording medium; and consequent upon said review 

- selectively retrieving items of audio/video material from said recording 
medium to form said audio/video production. 

47. A method as claimed in Claim 46, comprising the step of 

- loading meta data describing the content of the audio/video material items 
into a data base; the step of reviewing the meta data comprising the step of 

- interrogating said data base. 

48. A method as claimed in Claim 47, wherein said meta data is present on 
said recording medium with said items of audio/video material, and said method 
further comprises the steps of 

- ingesting said meta data using said ingestion processor, 

- communicating said meta data to said data base, and 

- storing said meta data in said data base. 
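
The method of Claims 46 to 48 (ingest the meta data, store it in a data base, interrogate the data base, then select items) can be sketched with an in-memory SQLite data base. The table layout, the column names, the placeholder identifiers and the keyword search are all assumptions made for illustration; the claims do not prescribe a schema or a query mechanism.

```python
import sqlite3


def create_database():
    """Create an in-memory data base for meta data (illustrative schema)."""
    db = sqlite3.connect(":memory:")
    db.execute(
        "CREATE TABLE meta_data ("
        " umid TEXT PRIMARY KEY, tape_id TEXT, in_timecode TEXT,"
        " out_timecode TEXT, description TEXT)"
    )
    return db


def store_meta_data(db, records):
    """Store meta data ingested from the recording medium."""
    db.executemany("INSERT INTO meta_data VALUES (?, ?, ?, ?, ?)", records)
    db.commit()


def interrogate(db, keyword):
    """Review the meta data: find items whose description mentions `keyword`."""
    rows = db.execute(
        "SELECT umid, tape_id, in_timecode, out_timecode FROM meta_data"
        " WHERE description LIKE ? ORDER BY in_timecode",
        (f"%{keyword}%",),
    )
    return rows.fetchall()


# Placeholder identifiers; a real UMID is a much longer binary label.
ingested = [
    ("umid-001", "TAPE-0001", "00:00:00:00", "00:01:09:24", "street scene"),
    ("umid-002", "TAPE-0001", "00:01:10:00", "00:02:30:00", "interview, mayor"),
    ("umid-003", "TAPE-0001", "00:02:30:01", "00:03:00:00", "interview, resident"),
]
db = create_database()
store_meta_data(db, ingested)
selection = interrogate(db, "interview")
```

Each row in `selection` carries the tape ID and time codes needed to ask the ingestion processor to retrieve just that item from the recording medium.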

49. A computer program providing computer executable instructions, 
which when loaded onto a data processor configures the data processor to operate as 
an audio/video reproducing apparatus according to any of Claims 1 to 13, or 35, or a 
video processing apparatus according to any of Claims 15 to 24, or an audio 
processing apparatus according to any of Claims 26 to 33, or an editing system 
according to any of Claims 25, 34 or 38 to 45. 

50. A computer program providing computer executable instructions, 
which when loaded on to a data processor causes the data processor to perform the 
method according to any of Claims 14, 36, 37 or Claims 46 to 48. 

51. A computer program product having a computer readable medium having 
recorded thereon information signals representative of the computer program claimed 
in any of Claims 49 or 50. 

52. A signal representing audio and/or video material produced by the 
audio/video reproducing apparatus according to any of Claims 1 to 13, or 35, or the 
sample images produced by the video processing apparatus according to any of Claims 
15 to 24, or the data representing the content of the speech produced by the audio 
processing apparatus according to any of Claims 26 to 33, or an audio/video 
production produced by the editing system according to any of Claims 25, 34 or 38 to 
45. 

53. A data carrier on which is recorded data representing audio and/or 
video material produced by the audio/video reproducing apparatus according to any 
of Claims 1 to 13, or 35, or the sample images produced by the video processing 
apparatus according to any of Claims 15 to 24, or the data representing the content of 
the speech produced by the audio processing apparatus according to any of Claims 26 
to 33, or an audio/video production produced by the editing system according to any of 
Claims 25, 34 or 38 to 45. 

54. A system for editing audio/video material items as herein before 
described with reference to the accompanying drawings. 

55. A method of reproducing items of audio/video material as herein before 
described with reference to the accompanying drawings. 

56. An audio/video reproducing apparatus as herein before described with 
reference to the accompanying drawings. 

57. An audio or a video processing apparatus as herein before described 
with reference to the accompanying drawings. 

58. A method of processing items of audio/video material as herein before 
described with reference to the accompanying drawings. 